Abstract 

In a wireless network with a single source and a single destination and an arbitrary number of 
relay nodes, what is the maximum rate of information flow achievable? We make progress on this 
long-standing problem through a two-step approach. First we propose a deterministic channel model which 
captures the key wireless properties of signal strength, broadcast and superposition. We obtain an exact 
characterization of the capacity of a network with nodes connected by such deterministic channels. 
This result is a natural generalization of the celebrated max-flow min-cut theorem for wired networks. 
Second, we use the insights obtained from the deterministic analysis to design a new quantize-map-and- 
forward scheme for Gaussian networks. In this scheme, each relay quantizes the received signal at the 
noise level and maps it to a random Gaussian codeword for forwarding. We show that, in contrast to 
existing schemes, this scheme can achieve the cut-set upper bound to within a gap which is independent 
of the channel parameters. In the case of the relay channel with a single relay as well as the two-relay 
Gaussian diamond network, the gap is 1 bit/s/Hz. Moreover, the scheme is universal in the sense that 
the relays need no knowledge of the values of the channel parameters. We also present extensions of 
the results to multicast networks, half-duplex networks and ergodic networks. 

I. Introduction 
Two main distinguishing features of wireless communication are: 

• broadcast: wireless users communicate over the air and signals from any one transmitter 
are heard by multiple nodes with possibly different signal strengths. 

• superposition: a wireless node receives signals from multiple simultaneously transmitting 
nodes, with the received signals all superimposed on top of each other. 

Because of these effects, links in a wireless network are never isolated but instead interact in 
seemingly complex ways. On the one hand, this facilitates the spread of information among users 
in a network; on the other hand it can be harmful by creating signal interference among users. 
This is in direct contrast to wired networks, where transmitter-receiver pairs can be thought of as 
isolated point-to-point links. Starting from the max-flow-min-cut theorem of Ford-Fulkerson [1], 
there has been significant progress in understanding network flow over wired networks. Much 
less, however, is known for wireless networks. 

The linear additive Gaussian channel model is a commonly used model to capture signal 
interactions in wireless channels. Over the past couple of decades, capacity study of Gaussian 
networks has been an active area of research. However, due to the complexity of the Gaussian 
model, except for the simplest networks such as the one-to-many Gaussian broadcast channel 
and the many-to-one Gaussian multiple access channel, the capacity of most Gaussian networks 
is still unknown. For example, even the capacity of a Gaussian single-relay network, in which 
a point-to-point communication is assisted by one relay, has been open for more than 30 years. 
In order to make progress on this problem, we take a two-step approach. We first focus on the 
signal interaction in wireless networks rather than on the noise. We present a new deterministic 
channel model which is analytically simpler than the Gaussian model yet still captures 
three key features of wireless communication: channel strength, broadcast, and superposition. A 
motivation to study such a model is that in contrast to point-to-point channels where noise is the 
only source of uncertainty, networks often operate in the interference-limited regime where the 
noise power is small compared to signal powers. Therefore, for a first level of understanding, our 
focus is on such signal interactions rather than the background noise. Like the Gaussian model, 
our deterministic model is linear, but unlike the Gaussian model, operations are over a finite field. 
The simplicity of scalar finite-field channel models has also been noted in [2]. We provide a 
complete characterization of the capacity of a network of nodes connected by such deterministic 
channels. This result is a natural generalization of the max-flow min-cut theorem for wired 
networks. 

The second step is to utilize the insights from the deterministic analysis to find "approximately 
optimal" communication schemes for Gaussian relay networks. The analysis for deterministic 
networks not only gives us insights for potentially successful coding schemes for the Gaussian 
case, but also gives tools for the proof techniques used. We show that in Gaussian networks, 
an approximate max-flow min-cut result can be shown, where the approximation is within an 
additive constant which is universal over the values of the channel parameters (but could depend 
on the number of nodes in the network). For example, the additive gap for both the single- 
relay network and for the two-relay diamond network is 1 bit/s/Hz. This is the first result we 
are aware of that provides such performance guarantees on relaying schemes. To highlight the 
strength of this result, we demonstrate that none of the existing strategies in the literature, 
like amplify-and-forward, decode-and-forward and Gaussian compress-and-forward, yield such 
a universal approximation for arbitrary networks. Instead, a scheme, which we term quantize- 
map-and-forward, provides such a universal approximation. 

In this paper we focus on unicast and multicast communication scenarios. In the unicast 
scenario, one source wants to communicate to a single destination. In the multicast scenario, 
the source wants to transmit the same message to multiple destinations. Since in these scenarios, all 
destination nodes are interested in the same message, there is no interference between different 
information streams in the network. There is only one information stream. Due to the broadcast 
nature of the wireless medium, multiple copies of a transmitted signal are received at different 
relays and superimposed with other received signals. However, since they are all a function of 
the same message, they are not considered as interference. In fact, the quantize-map-and-forward 
strategy exploits this broadcast nature by forwarding all the available information received at 
the various relays to the final destination. This is in contrast to more classical approaches of 
dealing with simultaneous transmissions by either avoiding them through transmit scheduling or 
treating signals from all nodes other than the intended transmitter as interference adding to the 
noise floor. These approaches attempt to convert the wireless network into a wired network but 
are strictly sub-optimal. 

A. Related Work 

In the literature, there has been extensive research over the last three decades to characterize 
the capacity of relay networks. The single-relay channel was first introduced in 1971 by van 
der Meulen [3] and the most general strategies for this network were developed by Cover and 
El Gamal [4]. There has also been a significant effort to generalize these ideas to arbitrary 
multi-relay networks with simple channel models. An early attempt was made in the Ph.D. 
thesis of Aref [5], where a max-flow min-cut result was established to characterize the unicast 
capacity of a deterministic broadcast relay network without superposition. This was an early 
precursor to network coding, which established the multicast capacity of wired networks, a 
deterministic capacitated graph without broadcast or superposition [6], [7], [8]. These two ideas 
were combined in [9], which established a max-flow min-cut characterization for multicast flows 
for "Aref networks". However, such complete characterizations are not known for arbitrary (even 
deterministic) networks with both broadcast and superposition. One notable exception is the work 
[10], which takes a scalar deterministic linear finite-field model and uses probabilistic erasures 
to model channel failures. For this model, using results on erasure broadcast networks [11], they 
established an asymptotic result on the unicast capacity as the field size grows. However, in 
all these works there is no connection between the proposed channel model and the physical 
wireless channel. 

There has also been a rich body of literature in directly tackling the noisy relay network 
capacity problem. In [12] the "diamond" network of parallel relay channels with no direct 
link between the source and the destination was examined. Xie and Kumar generalized the 
decode-forward encoding scheme for a network of multiple relays [13]. Kramer et al. [14] also 
generalized the compress-forward strategy to networks with a single layer of relay nodes. Though 
there have been many interesting and important ideas developed in these papers, the capacity 
characterization of Gaussian relay networks is still unresolved. In fact even a performance 
guarantee, such as establishing how far these schemes are from an upper bound, is unknown. 
Indeed, as we will see in Section III, these strategies do not yield an approximation guarantee for 
general networks. 

Our results are connected to the concept of network coding in several ways. The most direct 
connection is that our results on the multicast capacity of deterministic networks are direct 
generalizations of network coding results [6], [7], [8], [15], [16] as well as Aref networks [5], 
[9]. The coding techniques for the deterministic case are inspired by and generalize the random 
network coding technique of [6] and the linear coding technique of [7], [8], [17]. The 
quantize-map-and-forward technique proposed in this paper for the Gaussian wireless networks uses the 
insights from the deterministic framework and is philosophically the network coding technique 
generalized to noisy wireless networks. 

B. Outline of the paper 

We first develop an analytically simple linear finite-field model and motivate it by connecting 
it to the Gaussian model in the context of several simple multiuser networks. We also discuss its 
limitations. This is done in Section II. This model also suggests achievable strategies to explore 
in Gaussian relay networks, as done in Section III, where we illustrate the deterministic approach 
on several progressively more complex example networks. The deterministic model also makes 
clear that several well-known strategies can be in fact arbitrarily far away from optimality in 
these example networks. 

Section IV summarizes the main results of the paper. Section V focuses on the capacity 
analysis of networks with nodes connected by deterministic channels. We examine an arbitrary 
deterministic channel model (not necessarily linear nor finite-field) and establish an achievable 
rate for an arbitrary network. For the special case of linear finite-field deterministic models, 
this achievable rate matches the cut-set bound, so an exact characterization is possible. The 
achievable strategy involves each node randomly mapping the received signal to a transmitted 
signal, and the final destination solving for the information bits from all the received equations. 

The examination of the deterministic relay network motivates the introduction of a simple 
quantize-map-and-forward strategy for general Gaussian relay networks. In this scheme each 
relay first quantizes the received signal at the noise level, then randomly maps it to a Gaussian 
codeword and transmits it.¹ In Section VI we use the insights of the deterministic result to 
demonstrate that we can achieve a rate that is guaranteed to be within a constant gap from the 
cut-set upper bound on capacity. As a byproduct, we show in Section VII that a deterministic 
model formed by quantizing the received signals at noise level at all nodes and then removing 
the noise is within a constant gap to the capacity of the Gaussian relay network. 

In Section VIII we show that the quantize-map-and-forward scheme has the desirable property 
that the relay nodes do not need the knowledge of the channel gains. As long as the network 
can support a given rate, we can achieve it without the relays' knowledge of the channel gains. 
In Section VIII we also establish several other extensions to our results, such as relay networks 
with half-duplex constraints, and relay networks with fading or frequency selective channels. 

¹This is distinct from the compress-and-forward scheme studied in [4], where the quantized value is to be reconstructed at the 
destination. Our scheme does not require the quantized values to be reconstructed, but just the source codeword to be decoded. 



June 4, 2010 



DRAFT 






II. Deterministic modeling of wireless channel 

The goal of this section is to introduce the linear deterministic model and illustrate how we 
can deterministically model three key features of a wireless channel. 

A. Modeling signal strength 

Consider the real scalar Gaussian model for a point-to-point link, 

y = hx + z (1) 

where $z \sim \mathcal{N}(0, 1)$. There is also an average power constraint $E[|x|^2] \le 1$ at the transmitter. 

The transmit power and noise power are both normalized to be equal to 1 and the channel gain 
$h$ is related to the signal-to-noise ratio (SNR) by 

$$|h| = \sqrt{\mathsf{SNR}}. \tag{2}$$

It is well known that the capacity of this point-to-point channel is 

$$C_{\mathrm{AWGN}} = \frac{1}{2} \log (1 + \mathsf{SNR}). \tag{3}$$

To get an intuitive understanding of this capacity formula let us write the received signal in 
Equation (1), $y$, in terms of the binary expansions of $x$ and $z$. For simplicity assuming $h$, $x$ and 
$z$ are positive real numbers and $x$ has a peak power constraint of 1, we have 

$$y = 2^{\frac{1}{2} \log \mathsf{SNR}} \sum_{i=1}^{\infty} x(i) 2^{-i} + \sum_{i=-\infty}^{\infty} z(i) 2^{-i}. \tag{4}$$

To simplify the effect of background noise assume it has a peak power equal to 1. Then we can 
write 

$$y = 2^{\frac{1}{2} \log \mathsf{SNR}} \sum_{i=1}^{\infty} x(i) 2^{-i} + \sum_{i=1}^{\infty} z(i) 2^{-i} \tag{5}$$

or, 

$$y \approx 2^{n} \sum_{i=1}^{n} x(i) 2^{-i} + \sum_{i=1}^{\infty} \left( x(i+n) + z(i) \right) 2^{-i} \tag{6}$$

where $n = \lceil \frac{1}{2} \log \mathsf{SNR} \rceil^+$. Therefore if we just ignore the 1 bit of the carry-over from the 
second summation ($\sum_{i=1}^{\infty} (x(i+n) + z(i)) 2^{-i}$) to the first summation ($2^n \sum_{i=1}^{n} x(i) 2^{-i}$) we can 
approximate a point-to-point Gaussian channel as a pipe that truncates the transmitted signal 
and only passes the bits that are above the noise level. Therefore think of the transmitted signal $x$ 
as a sequence of bits at different signal levels, with the highest signal level in $x$ being the most 
significant bit and the lowest level being the least significant bit. In this simplified model the 
receiver can see the $n$ most significant bits of $x$ without any noise and the rest are not seen at 
all. There is a correspondence between $n$ and SNR in dB scale, 

$$n \leftrightarrow \lceil \tfrac{1}{2} \log \mathsf{SNR} \rceil^+. \tag{7}$$

This simplified model, shown in Figure 1, is deterministic. Each circle in the figure represents a 
signal level which holds a binary digit for transmission. The most significant $n$ bits are received 
at the destination while less significant bits are not. 

Fig. 1. Pictorial representation of the deterministic model for point-to-point channel. 

These signal levels can potentially be created using a multi-level lattice code in the AWGN 
channel [18]. Then the first $n$ levels in the deterministic model represent those levels (in the 
lattice chain) that are above noise level, and the remaining are the ones that are below noise 
level. We can algebraically write this input-output relationship by shifting $\mathbf{x}$ down by $q - n$ 
elements 

$$\mathbf{y} = \mathbf{S}^{q-n} \mathbf{x} \tag{8}$$

where $\mathbf{x}$ and $\mathbf{y}$ are binary vectors of length $q$ denoting transmit and received signals respectively 
and $\mathbf{S}$ is the $q \times q$ shift matrix, 

$$\mathbf{S} = \begin{pmatrix} 0 & 0 & 0 & \cdots & 0 \\ 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & 1 & 0 \end{pmatrix}. \tag{9}$$

The capacity of this deterministic point-to-point channel is $n$, where $n = \lceil \frac{1}{2} \log \mathsf{SNR} \rceil^+$. This 
capacity is within a ½-bit approximation of the capacity of the AWGN channel. In the case of a 
complex Gaussian channel we set $n = \lceil \log \mathsf{SNR} \rceil^+$ and we get an approximation within 1 bit of 
the capacity. 
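
As an illustration of Equations (7) and (8), the following minimal Python sketch applies the shift $\mathbf{S}^{q-n}$ to a bit vector and computes the number of levels $n$ from a given SNR for the real channel. The function names `shift_channel` and `signal_levels_real`, the list-of-bits representation (most significant bit first), and the use of base-2 logarithms are assumptions made for this example only.

```python
import math


def shift_channel(x_bits, n):
    """Deterministic point-to-point channel y = S^(q-n) x over F_2.

    x_bits holds the transmitted bits, most significant first (length q).
    Shifting down by q - n positions keeps the n most significant bits of x
    and places them in the last n positions of y; the rest are zeros.
    """
    q = len(x_bits)
    if not 0 <= n <= q:
        raise ValueError("n must satisfy 0 <= n <= q")
    return [0] * (q - n) + list(x_bits[:n])


def signal_levels_real(snr):
    """n = ceil(0.5 * log2(SNR))^+ for the real scalar Gaussian channel."""
    return max(0, math.ceil(0.5 * math.log2(snr)))


# Hypothetical example: q = 5 levels, of which n = 3 are above the noise level.
x = [1, 0, 1, 1, 0]
y = shift_channel(x, 3)  # the three most significant bits of x survive
```

In this sketch the receiver recovers exactly `x[:3]` from `y`, which matches the statement that the deterministic link carries $n$ bits per use.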

B. Modeling broadcast 

Based on the intuition obtained so far, it is straightforward to think of a deterministic model 
for a broadcast scenario. Consider the real scalar Gaussian broadcast channel (BC). Assume 
there are only two receivers. The received SNR at receiver $i$ is denoted by $\mathsf{SNR}_i$ for $i = 1,2$ 
($\mathsf{SNR}_2 \le \mathsf{SNR}_1$). Consider the binary expansion of the transmitted signal, $x$. Then we can 
deterministically model the Gaussian broadcast channel as the following: 

• Receiver 2 (weak user) receives only the most significant $n_2$ bits in the binary expansion 
of $x$. Those bits are the ones that arrive above the noise level. 

• Receiver 1 (strong user) receives the most significant $n_1$ ($n_1 > n_2$) bits in the binary 
expansion of $x$. Clearly these bits contain what receiver 2 gets. 

The deterministic model makes explicit the functioning of superposition coding and successive 
interference cancellation decoding in the Gaussian broadcast channel. The most significant $n_2$ 
levels in the deterministic model represent the cloud center that is decoded by both users, and 
the remaining $n_1 - n_2$ levels represent the cloud detail that is decoded only by the strong user 
(after decoding the cloud center and canceling it from the received signal). 

Pictorially the deterministic model is shown in Figure 2(a). In this particular example $n_1 = 5$ 
and $n_2 = 2$, therefore both users receive the two most significant bits of the transmitted signal. 
However user 1 (strong user) receives three additional bits from the next three signal levels of 
the transmitted signal. There is also the same correspondence between $n$ and channel gains in 
dB: 

$$n_i \leftrightarrow \lceil \tfrac{1}{2} \log \mathsf{SNR}_i \rceil^+, \quad i = 1,2. \tag{10}$$

To analytically demonstrate how closely we are modeling the Gaussian BC, the 
capacity regions of the Gaussian BC and the deterministic BC are shown in 
Figure 2(b). As can be seen, their capacity regions are very close to each other. In fact it is easy 
to verify that for all SNRs these regions are always within one bit per user of each other, that 
is, if $(R_1, R_2)$ is in the capacity region of the deterministic BC then there is a rate pair within 
one bit component-wise of $(R_1, R_2)$ that is in the capacity region of the Gaussian BC. However, 
this is only the worst-case gap and in the typical case where $\mathsf{SNR}_1$ and $\mathsf{SNR}_2$ are very different, 
the gap is much smaller than one bit. 

Fig. 2. Pictorial representation of the deterministic model for Gaussian BC is shown in (a). Capacity region of Gaussian and 
deterministic BC are shown in (b). In (b), the Gaussian BC region is drawn with a solid line and the deterministic BC region with a dashed line. 
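
The nested structure of the deterministic broadcast model can be sketched in a few lines of Python. The function names `broadcast_outputs` and `split_levels` and the bit-list representation (most significant bit first) are hypothetical and chosen only for this illustration; the gains $n_1 = 5$ and $n_2 = 2$ follow the example in the text.

```python
def broadcast_outputs(x_bits, n1, n2):
    """Bits seen by receiver 1 (strong) and receiver 2 (weak).

    Each receiver observes a prefix of the transmitted bit vector: the
    n_i most significant bits, i.e. the levels above its noise floor.
    """
    if not 0 <= n2 <= n1 <= len(x_bits):
        raise ValueError("need 0 <= n2 <= n1 <= len(x_bits)")
    return list(x_bits[:n1]), list(x_bits[:n2])


def split_levels(x_bits, n1, n2):
    """Split the top n1 levels into cloud center and cloud detail.

    The cloud center (top n2 levels) is seen by both receivers; the cloud
    detail (next n1 - n2 levels) is seen only by the strong receiver.
    """
    strong, weak = broadcast_outputs(x_bits, n1, n2)
    return weak, strong[n2:]


# Gains from the example in the text; the transmitted bits are made up.
x = [1, 1, 0, 1, 0, 1]
center, detail = split_levels(x, n1=5, n2=2)
```

Here `center` holds the two bits common to both users and `detail` holds the three additional bits available only to the strong user.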

C. Modeling superposition 

Consider a superposition scenario in which two users are simultaneously transmitting to a 
node. In the Gaussian model the received signal can be written as 

$$y = h_1 x_1 + h_2 x_2 + z. \tag{11}$$

To intuitively see what happens in superposition in the Gaussian model, we again write the 
received signal, $y$, in terms of the binary expansions of $x_1$, $x_2$ and $z$. Assume $x_1$, $x_2$ and $z$ are 
all positive real numbers smaller than one, and also the channel gains are 

$$h_i = \sqrt{\mathsf{SNR}_i}, \quad i = 1,2. \tag{12}$$

Without loss of generality assume $\mathsf{SNR}_2 \le \mathsf{SNR}_1$. Then 

$$y = 2^{\frac{1}{2} \log \mathsf{SNR}_1} \sum_{i=1}^{\infty} x_1(i) 2^{-i} + 2^{\frac{1}{2} \log \mathsf{SNR}_2} \sum_{i=1}^{\infty} x_2(i) 2^{-i} + \sum_{i=-\infty}^{\infty} z(i) 2^{-i}. \tag{13}$$

To simplify the effect of background noise assume it has a peak power equal to 1. Then we can 
write 

$$y = 2^{\frac{1}{2} \log \mathsf{SNR}_1} \sum_{i=1}^{\infty} x_1(i) 2^{-i} + 2^{\frac{1}{2} \log \mathsf{SNR}_2} \sum_{i=1}^{\infty} x_2(i) 2^{-i} + \sum_{i=1}^{\infty} z(i) 2^{-i} \tag{14}$$

or, 

$$y \approx 2^{n_1} \sum_{i=1}^{n_1 - n_2} x_1(i) 2^{-i} + 2^{n_2} \sum_{i=1}^{n_2} \left( x_1(i + n_1 - n_2) + x_2(i) \right) 2^{-i} + \sum_{i=1}^{\infty} \left( x_1(i + n_1) + x_2(i + n_2) + z(i) \right) 2^{-i} \tag{15}$$

where $n_i = \lceil \frac{1}{2} \log \mathsf{SNR}_i \rceil^+$ for $i = 1,2$. Therefore based on the intuition obtained from the 
point-to-point and broadcast AWGN channels, we can approximately model this as the following: 

• That part of $x_1$ that is above $\mathsf{SNR}_2$ ($x_1(i)$, $1 \le i \le n_1 - n_2$) is received clearly without any 
contribution from $x_2$. 

• The remaining part of $x_1$ that is above noise level ($x_1(i)$, $n_1 - n_2 < i \le n_1$) and that part 
of $x_2$ that is above noise level ($x_2(i)$, $1 \le i \le n_2$) are superposed on each other and are 
received without any noise. 

• Those parts of $x_1$ and $x_2$ that are below noise level are truncated and not received at all. 

The key point is how to model the superposition of the bits that are received at the same 
signal level. In our deterministic model we ignore the carry-overs of the real addition and we 
model the superposition by the modulo 2 sum of the bits that arrive at the same signal 
level. Pictorially the deterministic model is shown in Figure 4(a). Analogous to the deterministic 
model for the point-to-point channel, as seen in Figure 3, we can write 

$$\mathbf{y} = \mathbf{S}^{q-n_1} \mathbf{x}_1 \oplus \mathbf{S}^{q-n_2} \mathbf{x}_2 \tag{16}$$

where the summation is in $\mathbb{F}_2$ (modulo 2). Here $\mathbf{x}_i$ ($i = 1, 2$) and $\mathbf{y}$ are binary vectors of length 
$q$ denoting transmitted and received signals respectively and $\mathbf{S}$ is a $q \times q$ shift matrix. The 
relationship between the $n_i$'s and the channel gains is the same as in Equation (10). 

Compared to the point-to-point case we now have interaction between the bits that are received 
at the same signal level at the receiver. We limit the receiver to observe only the modulo 
2 summation of those bits that arrive at the same signal level. This way of modeling signal 
interaction has two advantages over the simplistic collision model. First, if two bits arrive 
simultaneously at the same signal level, they are not both dropped and the receiver gets their 
modulo 2 summation. Second, unlike in the collision model where the entire packet is lost when 
there is collision, the most significant bits of the stronger user remain intact. This is reminiscent 
of the familiar capture phenomenon in CDMA systems: the strongest user can be heard even 
when multiple users simultaneously transmit. 

Fig. 3. Algebraic representation of shift matrix deterministic model. 

Fig. 4. Pictorial representation of the deterministic MAC is shown in (a). Capacity region of Gaussian and deterministic MACs 
are shown in (b). In (b), the Gaussian MAC region is drawn with a solid line and the deterministic MAC region with a dashed line. 
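
The superposition rule of Equation (16) can be sketched as follows. The helper names `shift_down` and `superpose`, the bit-list representation (most significant bit first), and the example gains and bit patterns are assumptions made for illustration only.

```python
def shift_down(bits, n):
    """Apply S^(q-n) to a length-q bit vector (most significant bit first)."""
    q = len(bits)
    if not 0 <= n <= q:
        raise ValueError("gain must satisfy 0 <= n <= q")
    return [0] * (q - n) + list(bits[:n])


def superpose(x1_bits, x2_bits, n1, n2):
    """y = S^(q-n1) x1 XOR S^(q-n2) x2, with addition taken modulo 2."""
    if len(x1_bits) != len(x2_bits):
        raise ValueError("both inputs must have the same length q")
    y1 = shift_down(x1_bits, n1)
    y2 = shift_down(x2_bits, n2)
    return [a ^ b for a, b in zip(y1, y2)]


# Hypothetical example with q = 4, a strong user (n1 = 4) and a weak user (n2 = 2).
x1 = [1, 0, 1, 1]
x2 = [1, 1, 0, 0]
y = superpose(x1, x2, n1=4, n2=2)
```

In this example the top $n_1 - n_2 = 2$ levels of `y` equal the two most significant bits of `x1`, which illustrates the point that the stronger user's leading bits remain intact, while the lower two levels carry the modulo 2 sum of bits from both users.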

Now we can apply this model to the Gaussian MAC, in which 

$$y = h_1 x_1 + h_2 x_2 + z \tag{17}$$

where $z \sim \mathcal{CN}(0, 1)$. There is also an average power constraint equal to 1 at both transmitters. 
A natural question is how close the capacity region of the deterministic model is to that of the 
actual Gaussian model. Assume $\mathsf{SNR}_2 \le \mathsf{SNR}_1$. The capacity region of this channel is known 
to be the set of non-negative pairs $(R_1, R_2)$ satisfying 

$$R_i \le \log(1 + \mathsf{SNR}_i), \quad i = 1,2 \tag{18}$$

$$R_1 + R_2 \le \log(1 + \mathsf{SNR}_1 + \mathsf{SNR}_2). \tag{19}$$

This region is plotted with solid line in Figure 4(b). 

It is easy to verify that the capacity region of the deterministic MAC is the set of non-negative 
pairs $(R_1, R_2)$ satisfying 

$$R_2 \le n_2 \tag{20}$$

$$R_1 + R_2 \le n_1 \tag{21}$$

where $n_i = \lceil \log \mathsf{SNR}_i \rceil^+$ for $i = 1,2$. This region is plotted with dashed line in Figure 4(b). In 
this deterministic model the "carry-over" from one level to the next that would happen with real 
addition is ignored. However as we notice still the capacity region is very close to the capacity 
region of the Gaussian model. In fact it is easy to verify that they are within one bit per user of 
each other. The intuitive explanation for this is that in real addition once two bounded signals 
are added together the magnitude can become as large as twice the larger of the two signals. 
Therefore the number of bits in the sum is increased by at most one bit. On the other hand in 
finite-field addition there is no magnitude associated with signals and the summation is still in 
the same field as the individual signals. So the gap between Gaussian and deterministic model 
for two user MAC is intuitively this one bit of cardinality increase. Similar to the broadcast 
example, this is only the worst case gap and when the channel gains are different it is much 
smaller than one bit. 

Now we define the linear finite-field deterministic model for the relay network. 

D. Linear finite-field deterministic model 

The relay network is defined using a set of vertices $\mathcal{V}$. The communication link from node $i$ 
to node $j$ has a non-negative integer gain $n_{ij}$ associated with it. This number models the channel 
gain in the corresponding Gaussian setting. At each time $t$, node $i$ transmits a vector $\mathbf{x}_i[t] \in \mathbb{F}_p^q$ 
and receives a vector $\mathbf{y}_i[t] \in \mathbb{F}_p^q$ where $q = \max_{i,j} (n_{ij})$ and $p$ is a positive integer indicating 
the field size. The received signal at each node is a deterministic function of the transmitted 
signals at the other nodes, with the following input-output relation: if the nodes in the network 
transmit $\mathbf{x}_1[t], \mathbf{x}_2[t], \ldots, \mathbf{x}_N[t]$ then the received signal at node $j$, $1 \le j \le N$, is: 

$$\mathbf{y}_j[t] = \sum_{i=1}^{N} \mathbf{S}^{q - n_{ij}} \mathbf{x}_i[t] \tag{22}$$

where the summations and the multiplications are in $\mathbb{F}_p$. In this paper the field size is assumed 
to be two, $p = 2$, unless it is stated otherwise. 
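
The input-output relation (22) for $p = 2$ can be evaluated with the following minimal Python sketch. The function names, the `gains[i][j]` layout for $n_{ij}$, and the three-node example are hypothetical; a gain of 0 is taken to mean that there is no link, and, following the statement that a node's received signal depends on the signals transmitted at the other nodes, each node's own transmission is skipped.

```python
def shift_down(bits, n):
    """Apply S^(q-n) to a length-q bit vector (most significant bit first)."""
    q = len(bits)
    if not 0 <= n <= q:
        raise ValueError("gain must satisfy 0 <= n <= q")
    return [0] * (q - n) + list(bits[:n])


def received_signals(gains, tx):
    """Evaluate y_j = sum_i S^(q - n_ij) x_i over F_2 for every node j.

    gains[i][j] is the integer gain n_ij of the link from node i to node j
    (0 means no link); tx[i] is the length-q bit vector sent by node i.
    """
    q = len(tx[0])
    num_nodes = len(tx)
    rx = []
    for j in range(num_nodes):
        y = [0] * q
        for i in range(num_nodes):
            if i == j:
                continue  # a node's output depends only on the other nodes
            shifted = shift_down(tx[i], gains[i][j])
            y = [a ^ b for a, b in zip(y, shifted)]  # addition in F_2
        rx.append(y)
    return rx


# Hypothetical three-node example: node 0 = source, 1 = relay, 2 = destination.
gains = [[0, 3, 2],
         [0, 0, 3],
         [0, 0, 0]]
tx = [[1, 0, 1], [0, 1, 1], [0, 0, 0]]
rx = received_signals(gains, tx)
```

In this example $q = 3$ is the largest gain, the relay sees the source's three bits unchanged, and the destination sees the modulo 2 sum of the source's two most significant bits (shifted down by one level) and the relay's three bits.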

E. Limitation: Modeling MIMO 

The examples in the previous subsections may give the impression that the capacity of any 
Gaussian channel is within a constant gap to that of the corresponding linear deterministic model. 
The following example shows that is not the case. 

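Consider a $2 \times 2$ MIMO real Gaussian channel with channel gain values as shown in Figure 
5(a), where $k$ is an integer larger than 2. The channel matrix is 

$$\mathbf{H} = 2^k \begin{pmatrix} \frac{3}{4} & 1 \\ 1 & 1 \end{pmatrix}. \tag{23}$$

The channel gain parameters of the corresponding linear finite-field deterministic model are: 

$$n_{11} = \lceil \log_2 |h_{11}| \rceil^+ = \lceil \log_2 (2^k - 2^{k-2}) \rceil^+ = k \tag{24}$$

$$n_{12} = n_{21} = n_{22} = \lceil \log_2 2^k \rceil^+ = k \tag{25}$$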
ni2 = n2i = n22 = Roga 2''] + = k (25) 
Now let us compare the capacity of the MIMO channel under these two models for large values 
of $k$. For the Gaussian model, both singular values of $\mathbf{H}$ are of the order of $2^k$. Hence, the 
capacity of the real Gaussian MIMO channel is of the order of 

$$2 \times \frac{1}{2} \log (1 + |2^k|^2) \approx 2k. \tag{26}$$

However the capacity of the corresponding linear finite-field deterministic MIMO is simply 

$$\mathrm{rank} \begin{pmatrix} \mathbf{I}_k & \mathbf{I}_k \\ \mathbf{I}_k & \mathbf{I}_k \end{pmatrix} = k.$$

Hence the gap between the two capacities goes to infinity as $k$ increases. 

Even though the linear deterministic channel model does not approximate the Gaussian channel 
in all scenarios, it is still useful in providing insights in many cases, as will be seen in the next 
section. Moreover, its analytic simplicity allows an exact analysis of the relay network capacity. 
This in turn provides the foundation for the analysis of the Gaussian network. 

III. Motivation of our approach 

In this section we motivate and illustrate our approach. We look at three simple relay networks 
and illustrate how the analysis of these networks under the simpler linear finite-field deterministic 
model enables us to conjecture an approximately optimal relaying scheme for the Gaussian case. 
We progress from the relay channel where several strategies yield uniform approximation to more 
complicated networks where progressively we see that several "simple" strategies in the literature 
fail to achieve a constant gap. Using the deterministic model we can whittle down the potentially 
successful strategies. This illustrates the power of the deterministic model to provide insights 
into transmission techniques for noisy networks. 

The network is assumed to be synchronized, i.e., all transmissions occur on a common clock. 
The relays are allowed to do any causal processing. Therefore their current output depends only 
on their past received signals. For any such network, there is a natural information-theoretic cut-set 
bound [19], which upper bounds the reliable transmission rate $R$. Applied to the relay network, 
we have the cut-set upper bound $\overline{C}$ on its capacity: 

$$\overline{C} = \max_{p(\{x_j\}_{j \in \mathcal{V}})} \min_{\Omega \in \Lambda_D} I(Y_{\Omega^c}; X_{\Omega} \mid X_{\Omega^c}) \tag{27}$$

where $\Lambda_D = \{\Omega : S \in \Omega, D \in \Omega^c\}$ is the set of all source-destination cuts. In words, the value of a 
given cut $\Omega$ is the information rate achieved when the nodes in $\Omega$ fully cooperate to transmit 
and the nodes in $\Omega^c$ fully cooperate to receive. In the case of Gaussian networks, this is simply 
the mutual information achieved in a MIMO channel, the computation of which is standard. We 
will use this cut-set bound to assess how good our achievable strategies are. 

Fig. 6. The relay channel: (a) Gaussian model, (b) Linear finite-field deterministic model. 

A. Single-relay network 

We start by looking at the simplest Gaussian relay network with only one relay as shown in 
Figure 6(a). To approximate its capacity uniformly (uniform over all channel gains), we need 
to find a relaying protocol that achieves a rate close to an upper bound on the capacity for all 
channel parameters. To find such a scheme we use the linear finite-field deterministic model to 
gain insight. The corresponding linear finite-field deterministic model of this relay channel with 
channel gains denoted by $n_{SR}$, $n_{SD}$ and $n_{RD}$ is shown in Figure 6(b). It is easy to see that the 
capacity of this deterministic relay channel, $C^d_{\mathrm{relay}}$, can be no larger than either the maximum number 
of bits that the source can broadcast or the maximum number of bits that the destination can 
receive. Therefore 

$$C^d_{\mathrm{relay}} \le \min \left( \max(n_{SR}, n_{SD}), \max(n_{RD}, n_{SD}) \right) \tag{28}$$

$$= \begin{cases} n_{SD}, & \text{if } n_{SD} > \min(n_{SR}, n_{RD}), \\ \min(n_{SR}, n_{RD}), & \text{otherwise.} \end{cases} \tag{29}$$

It is not difficult to see that this is in fact the cut-set upper bound for the linear deterministic 
network. 

Fig. 7. The x and y axes respectively represent the channel gains from relay to destination and source to relay normalized by 
the gain of the direct link (source to destination) in dB scale. The z axis shows the value of the gap between the cut-set upper 
bound and the achievable rate of the decode-forward scheme in bits/sec/Hz. 
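
The bound on the deterministic relay channel can be written in two equivalent ways: as the minimum of the broadcast and receive limits, or as a case split on whether the direct link dominates. The following minimal Python sketch, with hypothetical function names and made-up example gains, evaluates both forms and checks by brute force that they agree on a small range of integer gains.

```python
from itertools import product


def relay_bound_minmax(n_sr, n_sd, n_rd):
    """min of what the source can broadcast and what the destination can receive."""
    return min(max(n_sr, n_sd), max(n_rd, n_sd))


def relay_bound_cases(n_sr, n_sd, n_rd):
    """Case form: direct link alone if it dominates, else the relay path."""
    if n_sd > min(n_sr, n_rd):
        return n_sd
    return min(n_sr, n_rd)


# Brute-force check that the two forms agree for all gains in 0..7.
forms_agree = all(
    relay_bound_minmax(a, b, c) == relay_bound_cases(a, b, c)
    for a, b, c in product(range(8), repeat=3)
)

# Hypothetical gains: a two-level direct link and three-level relay links.
example = relay_bound_minmax(n_sr=3, n_sd=2, n_rd=3)
```

With the hypothetical gains above the bound is 3 bits: two bits arrive over the direct link and the relay path contributes one more.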

Note that Equation (|29|) naturally implies a capacity-achieving scheme for this deterministic 
relay network: if the direct link is better than any of the links to/from the relay then the relay 
is silent, otherwise it helps the source by decoding its message and sending innovations. In the 
example of Figure [6l the destination receives two bits directly from the source, and the relay 
increases the capacity by 1 bit by forwarding the least significant bit it receives on a level that 
does not overlap with the direct transmission at the destination. This suggests a decode-and- 
forward scheme for the original Gaussian relay channel. The question is: how does it perform? 
Although unlike in the deterministic network, the decode-forward protocol cannot achieve exactly 
the cut-set bound in the Gaussian nettwork, the following theorem shows it is close. 

Theorem 3.1: The decode-and- forward relaying protocol achieves within 1 bit/s/Hz of the 
cut-set bound of the single-relay Gaussian network, for all channel gains. 

Proof: See Appendix lAl ■ 

We should point out that even this 1-bit gap is too conservative for many parameter values.
In fact the gap would be at the maximum value only if two of the channel gains are exactly
the same. This is rare in wireless scenarios. In Figure 7 the gap between the achievable rate of
the decode-forward scheme and the cut-set upper bound is plotted for different channel gains.

The deterministic network in Figure 6(b) suggests that several other relaying strategies are also
optimal. For example, compress-and-forward [4] will also achieve the cut-set bound. Moreover a 
"network coding" strategy of sending the sum (or linear combination) of the received bits is also 
optimal as long as the destination receives linearly independent equations. All these schemes can 
also be translated to the Gaussian case and can be shown to be uniformly approximate strategies. 
Therefore for the simple relay channel there are many successful candidate strategies. 

B. Diamond network 

Now consider the diamond Gaussian relay network, with two relays, as shown in Figure 8(a).
Schein introduced this network in his Ph.D. thesis [12] and investigated its capacity. However,
the capacity of this network is still open. We would like to uniformly approximate its capacity.

First we build the corresponding linear finite-field deterministic model for this relay network
as shown in Figure 8(b). To investigate its capacity, we first relax the interactions between
incoming links at each node and create the wired network shown in Figure 9. In this network
there are two other links added, which are from $S'$ to $S$ and from $D$ to $D'$. Since the capacities
of these links are respectively equal to the maximum number of bits that can be sent by the
source and the maximum number of bits that can be received by the destination in the original linear
finite-field deterministic network, the capacity of the wired diamond network cannot be smaller
than the capacity of the linear finite-field deterministic diamond network. Now by the max-flow
min-cut theorem we know that the capacity $C^{w}_{\text{diamond}}$ of the wired diamond network is equal to
the value of its minimum cut. Hence

$$C^{d}_{\text{diamond}} \le C^{w}_{\text{diamond}} = \min\big\{\max(n_{SA_1}, n_{SA_2}),\ \max(n_{A_1D}, n_{A_2D}),\ n_{SA_1} + n_{A_2D},\ n_{SA_2} + n_{A_1D}\big\}.$$ (30)

As we will show in Section V, this upper bound is in fact the cut-set upper bound on the capacity
of the deterministic diamond network.

Now, we know that the capacity of a wired network is achieved by a routing solution. We can 
indeed mimic the wired network routing solution in the linear finite-field deterministic diamond 
network and send the same amount of information through non-interfering links from source to 
relays and then from relays to destination. Therefore the capacity of the deterministic diamond 
network is equal to its cut-set upper bound. 
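The closed form for the wired diamond network can be cross-checked by brute force. The sketch below is only illustrative: it assumes the relaxed wired network has two added links (here called `S'` to `S` and `D` to `D'`, labels chosen for this example) whose capacities equal the best outgoing source link and the best incoming destination link, and it enumerates every cut.

```python
from itertools import combinations


def diamond_bound(n_sa1, n_sa2, n_a1d, n_a2d):
    """Closed-form minimum cut of the wired diamond network."""
    return min(max(n_sa1, n_sa2), max(n_a1d, n_a2d),
               n_sa1 + n_a2d, n_sa2 + n_a1d)


def wired_min_cut(n_sa1, n_sa2, n_a1d, n_a2d):
    """Brute-force minimum cut between S' and D' in the relaxed wired network."""
    edges = {
        ("S'", "S"): max(n_sa1, n_sa2),   # added link: bits the source can send
        ("S", "A1"): n_sa1,
        ("S", "A2"): n_sa2,
        ("A1", "D"): n_a1d,
        ("A2", "D"): n_a2d,
        ("D", "D'"): max(n_a1d, n_a2d),   # added link: bits the destination can receive
    }
    inner = ["S", "A1", "A2", "D"]
    best = None
    for size in range(len(inner) + 1):
        for subset in combinations(inner, size):
            source_side = {"S'"} | set(subset)
            value = sum(cap for (u, v), cap in edges.items()
                        if u in source_side and v not in source_side)
            best = value if best is None else min(best, value)
    return best


# Compare the two on a small grid of hypothetical integer gains.
cuts_agree = all(
    diamond_bound(a, b, c, d) == wired_min_cut(a, b, c, d)
    for a in range(4) for b in range(4) for c in range(4) for d in range(4)
)
print(cuts_agree)  # True
```

The cuts that isolate both source links or both destination links never attain the minimum, since a sum of two non-negative capacities is at least their maximum; this is why only four terms appear in the closed form.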

A natural analogy of this routing scheme for the Gaussian network is the following
partial-decode-and-forward strategy:

1) The source broadcasts two messages, $m_1$ and $m_2$, at rates $R_1$ and $R_2$ to relays $A_1$ and $A_2$,
respectively.

2) Each relay $A_i$ decodes message $m_i$, $i = 1, 2$.

3) Then $A_1$ and $A_2$ re-encode the messages and transmit them via the MAC channel to the
destination.

Clearly the destination can decode both $m_1$ and $m_2$ if $(R_1, R_2)$ is inside the capacity region
of the BC from source to relays as well as the capacity region of the MAC from relays to the
destination. The following theorem shows how good this scheme is.

Theorem 3.2: The partial-decode-and-forward relaying protocol achieves within 1 bit/s/Hz of the
cut-set upper bound of the two-relay diamond Gaussian network, for all channel gains.

Proof: See Appendix B. ■

We can also use the linear finite-field deterministic model to understand why other simple 
protocols such as decode-forward and amplify-forward are not universally-approximate strategies 
for the diamond network. 

Consider the example linear finite-field diamond network shown in Figure 10(a). The cut-set
upper bound on the capacity of this network is 3 bits/unit time. In a decode-forward scheme, all
participating relays should be able to decode the message. Therefore the rate of the
message broadcast from the source can be at most 2 bits/unit time. Also, if we ignore relay
$A_2$ and only use the stronger relay, it is still not possible to send information at a rate of
more than 1 bit/unit time. As a result we cannot achieve the capacity of this network by using
a decode-forward strategy.

We next show that this 1-bit gap can be translated into an unbounded gap in the corresponding
Gaussian network, as shown in Figure 10(b). By looking at the cut between the destination and
the rest of the network, it can be seen that for large $a$, the cut-set upper bound is approximately

$$3 \log a.$$ (31)

The achievable rate of the decode-forward strategy is upper bounded by

$$R_{DF} \le 2 \log a.$$ (32)

Therefore, as $a$ gets larger, the gap between the achievable rate of the decode-forward strategy and
the cut-set upper bound (31) increases.
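To make the scaling explicit, the two expressions can simply be subtracted. The line below only restates the approximation of the cut-set bound by $3 \log a$ and the decode-forward bound of $2 \log a$; the symbol $\bar{C}$ for the cut-set bound is borrowed from Section IV.

```latex
\bar{C} - R_{DF} \;\gtrsim\; 3\log a - 2\log a \;=\; \log a
\;\longrightarrow\; \infty \qquad \text{as } a \to \infty .
```

So the 1-bit loss in the deterministic example corresponds to a loss that grows like $\log a$ in the Gaussian network, which is why no constant can bound the gap of decode-forward for all channel gains.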

Let us look at the amplify-forward scheme. Although this scheme does not require all relays
to decode the entire message, it can be quite sub-optimal if relays inject significant noise into the
system. We use the deterministic model to intuitively see this effect. In a deterministic network,
the amplify-forward operation can be simply modeled by shifting bits up and down at each node.
However, once the bits are shifted up, the newly created LSBs represent the amplified bits of the
noise and we model them by random bits. Now, consider the example shown in Figure 10(a).
We notice that to achieve a rate of 3 from the source to the destination, the least significant bit
of the source's signal should go through $A_1$ while the remaining two bits go through $A_2$. Now
if $A_2$ is doing amplify-forward, it will have two choices: to either forward the received signal
without amplifying it, or to amplify the received signal to have three signal levels in magnitude
and forward it.

The effective networks under these two strategies are respectively shown in Figures 10(c) and
10(d). In the first case, since the total rate going through the MAC from $A_1$ and $A_2$ to $D$ is
less than two, the overall achievable rate cannot exceed two. In the second case, however, the
inefficiency of the amplify-forward strategy comes from the fact that $A_2$ is transmitting pure noise
on its lowest signal level. As a result, it is corrupting the bit transmitted by $A_1$ and reducing
the total achievable rate again to two bits/channel use. Therefore, for this channel realization,
the amplify-forward scheme does not achieve the capacity. This intuition can again be translated
to the corresponding Gaussian network to show that amplify-and-forward is not a
universally-approximate strategy for the diamond network.

C. A four-relay network 

We now look at a more complicated relay network with four relays, as shown in Figure 11.
As the first step let us find the optimal relaying strategy for the corresponding linear finite-field
deterministic model. Consider the example of a linear finite-field deterministic relay network
shown in Figure 12(a). Now focus on the relaying strategy that is pictorially shown in Figure
13. In this scheme,

• Source broadcasts $\mathbf{b} = [b_1, \ldots, b_5]^t$

• Relay $A_1$ decodes $b_3, b_4, b_5$ and relay $A_2$ decodes $b_1, b_2$

• Relay $A_1$ and $A_2$ respectively send $\mathbf{x}_{A_1} = [b_3, b_4, b_5, 0, 0]^t$ and $\mathbf{x}_{A_2} = [b_1, b_2, 0, 0, 0]^t$

• Relay $B_2$ decodes $b_1, b_2, b_3$ and sends $\mathbf{x}_{B_2} = [b_1, b_2, b_3, 0, 0]^t$

• Relay $B_1$ receives $\mathbf{y}_{B_1} = [0, 0, b_3, b_4 \oplus b_1, b_5 \oplus b_2]^t$ and forwards the last two equations,
$\mathbf{x}_{B_1} = [b_4 \oplus b_1, b_5 \oplus b_2, 0, 0, 0]^t$

• The destination gets $\mathbf{y}_D = [b_1, b_2, b_3, b_4 \oplus b_1, b_5 \oplus b_2]^t$ and is able to decode all five bits.
This scheme can achieve 5 bits per unit time, clearly the best that one can do since the
destination only receives 5 bits per unit time. In this optimal scheme the relay $B_1$ is not decoding
or partially decoding a message; it is forwarding the last two least significant bits. One may
wonder if this is necessary, or in other words, is any choice of partial-decode-and-forward strategy
suboptimal in this example? To answer this question, note that any partial-decode-and-forward
scheme can be visualized as different flows of information going from $S$ to $D$ that do not get
mixed in the network. Now since all transmitted signal levels of $A_1$ and $A_2$ are interfering with
each other, it is not possible to get a rate of more than 3 bits/unit time by any
partial-decode-and-forward scheme in this example and hence it is always suboptimal.

Fig. 10. An example of the linear finite-field deterministic diamond network is shown in (a). The corresponding Gaussian
network is shown in (b), with the gains chosen such that the ratio of the gains in dB scale match the ratios of the gains in the
deterministic network. The effective network when R2 just forwards the received signal is shown in (c). The effective network
when R2 amplifies the received signal to shift it up one signal level and then forward the message is shown in (d).

Fig. 12. An example of a four-relay linear finite-field deterministic relay network is shown in (a). The corresponding Gaussian
relay network is shown in (b). The effective Gaussian network for compress-forward strategy is shown in (c).

Fig. 13. Demonstration of a capacity achieving strategy.
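The decoding step of the five-bit scheme described above can be checked exhaustively. The sketch below takes the destination's received vector exactly as stated in the scheme, $[b_1, b_2, b_3, b_4 \oplus b_1, b_5 \oplus b_2]$, and does not model the link gains of Figure 12 (which are not reproduced here); the function names are illustrative.

```python
from itertools import product


def destination_observation(b):
    """Received vector at D for source bits b = (b1, ..., b5), as stated in the scheme."""
    b1, b2, b3, b4, b5 = b
    return [b1, b2, b3, b4 ^ b1, b5 ^ b2]


def decode(y_d):
    """Recover all five bits: b1..b3 arrive in the clear, b4 and b5 are unmasked by XOR."""
    b1, b2, b3, s4, s5 = y_d
    return (b1, b2, b3, s4 ^ b1, s5 ^ b2)


# Try every one of the 2^5 source vectors.
all_decoded = all(
    decode(destination_observation(b)) == b
    for b in product((0, 1), repeat=5)
)
print(all_decoded)  # True
```

The two sums forwarded by $B_1$ are useful only because $b_1$ and $b_2$ also reach the destination through $B_2$; this mixing of flows is what a partial-decode-and-forward scheme cannot reproduce.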

The last stage in the above scheme can actually be interpreted as a compress-and-forward
strategy: relays $B_1$ and $B_2$ want to send their 3-bit received vectors to the destination $D$, but
because the link from $B_1$ to $D$ only supports 2 bits, the dependency between these received
vectors must be exploited. However, in the Gaussian network, we cannot implement this strategy
using a standard compress-and-forward scheme pretending that the two received signals at $B_1$
and $B_2$ are jointly Gaussian. They are not. Relay $A_2$ sends nothing on its LSB, allowing the
MSB of relay $A_1$ to come through and appear as the LSB of the received signal at $B_2$. In fact,
the statistical correlation between the real-valued received signals at $B_1$ and $B_2$ is quite weak
since their MSBs are totally independent. Only when one views the received signals as vectors of
bits, as guided by the deterministic model, does the dependency between them become apparent. In
fact, it can be shown that a compress-and-forward strategy assuming jointly Gaussian distributed
received signals cannot achieve a constant gap to the cut-set bound.

D. Summary 

We learned two key points from the above examples: 

• All the schemes that achieve capacity of the deterministic networks in the examples forward 
the received bits at the various signal levels. 

• Using the deterministic model as a guide, it is revealed that commonly used schemes
such as decode-and-forward, partial decode-and-forward, amplify-and-forward and Gaussian
compress-and-forward can all be very far from the cut-set bound.

We devote the rest of the paper to generalizing the steps we took for the examples. As we 
will show, in the deterministic relay network the optimal strategy for each relay is to simply 
shuffle and linearly combine the received signals at various levels and forward them. This insight 
leads to a natural quantize-map-and-forward strategy for noisy (Gaussian) relay networks. The 
strategy for each relay is to quantize the received signal at the distortion of the noise power. 
This in effect extracts the bits of the received signals above the noise level. These bits are then
mapped randomly to a transmit Gaussian codeword. The main result of our paper is to show that
such a scheme is indeed universally approximate for arbitrary noisy Gaussian relay networks.

IV. Main Results 

In this section we precisely state the main results of the paper and briefly discuss their 
implications. The capacity of a relay network, C, is defined as the supremum of all achievable 
rates of reliable communication from the source to the destination. Similarly, the multicast
capacity of a relay network is defined as the maximum rate at which the source can send the same
information simultaneously to all destinations.

A. Deterministic networks 

1) General deterministic relay network: In the general deterministic model the received vector
signal $\mathbf{y}_j$ at node $j \in \mathcal{V}$ at time $t$ is given by

$$\mathbf{y}_j[t] = \mathbf{g}_j\big(\{\mathbf{x}_i[t]\}_{i \in \mathcal{V}}\big),$$ (33)

where $\{\mathbf{x}_i[t]\}_{i \in \mathcal{V}}$ denotes the transmitted signals at all of the nodes in the network. Note that this
implies a deterministic multiple access channel for node $j$ and a deterministic broadcast channel
for the transmitting nodes, so both broadcast and multiple access are allowed in this model. This
is a generalization of Aref networks Q which only allow broadcast. 
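The cut-set bound of a general deterministic relay network is:

$$\bar{C} = \max_{p(\{\mathbf{x}_j\}_{j \in \mathcal{V}})} \ \min_{\Omega \in \Lambda_D} I(\mathbf{y}_{\Omega^c}; \mathbf{x}_{\Omega} \mid \mathbf{x}_{\Omega^c})$$ (34)

$$\overset{(a)}{=} \max_{p(\{\mathbf{x}_j\}_{j \in \mathcal{V}})} \ \min_{\Omega \in \Lambda_D} H(\mathbf{y}_{\Omega^c} \mid \mathbf{x}_{\Omega^c})$$ (35)

where $\Lambda_D = \{\Omega : S \in \Omega, D \in \Omega^c\}$ is the set of all source-destination cuts. Step (a) follows since we are
dealing with deterministic networks.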

The following are our main results for arbitrary deterministic networks. 

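Theorem 4.1: A rate of

$$\max_{\prod_{i \in \mathcal{V}} p(\mathbf{x}_i)} \ \min_{\Omega \in \Lambda_D} H(\mathbf{y}_{\Omega^c} \mid \mathbf{x}_{\Omega^c})$$ (36)

can be achieved on a deterministic network.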

This theorem easily extends to the multicast case, where we want to simultaneously transmit
one message from $S$ to all destinations $D \in \mathcal{D}$:

Theorem 4.2: A multicast rate of

$$\max_{\prod_{i \in \mathcal{V}} p(\mathbf{x}_i)} \ \min_{D \in \mathcal{D}} \ \min_{\Omega \in \Lambda_D} H(\mathbf{y}_{\Omega^c} \mid \mathbf{x}_{\Omega^c})$$ (37)

to all the destinations $D \in \mathcal{D}$ can be achieved on a deterministic network.

Note that when we compare (36) to the cut-set upper bound in (35), we see that the difference
is in the maximizing set, i.e., we are only able to achieve independent (product) distributions
whereas the cut-set optimization is over any arbitrary distribution. In particular, if the network
and the deterministic functions are such that the cut-set is optimized by the product distribution,
then we would have matching upper and lower bounds. This happens for deterministic networks
with broadcast only, specializing to the result in [9]. It also happens when we consider the linear
finite-field model, whose results are stated next.

2) Linear finite-field deterministic relay network: Applying the cut-set bound to the linear
finite-field deterministic relay network defined in Section II-D, (22), and using (35) since we
have a deterministic network, we get:

$$\bar{C} = \max_{p(\{\mathbf{x}_j\}_{j \in \mathcal{V}})} \ \min_{\Omega \in \Lambda_D} H(\mathbf{y}_{\Omega^c} \mid \mathbf{x}_{\Omega^c}) \overset{(b)}{=} \min_{\Omega \in \Lambda_D} \mathrm{rank}(\mathbf{G}_{\Omega,\Omega^c})$$ (38)

where $\mathbf{G}_{\Omega,\Omega^c}$ is the transfer matrix associated with the cut $\Omega$, i.e., the matrix relating the vector
of all the inputs at the nodes in $\Omega$ to the vector of all the outputs in $\Omega^c$ induced by (22). This is
illustrated in Figure 14. Step (b) follows since in the linear finite-field model all cut values (i.e.,
$H(\mathbf{y}_{\Omega^c} \mid \mathbf{x}_{\Omega^c})$) are simultaneously optimized by an independent and uniform distribution of $\{\mathbf{x}_i\}_{i \in \mathcal{V}}$,
and the optimum value of each cut $\Omega$ is the logarithm of the size of the range space of the transfer
matrix $\mathbf{G}_{\Omega,\Omega^c}$ associated with that cut. Theorems 4.1 and 4.2 immediately imply that this cut-set
bound is achievable.

Theorem 4.3: The capacity $C$ of a linear finite-field deterministic relay network is given by

$$C = \min_{\Omega \in \Lambda_D} \mathrm{rank}(\mathbf{G}_{\Omega,\Omega^c}).$$ (39)

Theorem 4.4: The multicast capacity $C$ of a linear finite-field deterministic relay network is
given by

$$C = \min_{D \in \mathcal{D}} \ \min_{\Omega \in \Lambda_D} \mathrm{rank}(\mathbf{G}_{\Omega,\Omega^c}),$$ (40)

where $\mathcal{D}$ is the set of destinations.

Fig. 14. Illustration of cut-set bound and cut-set transfer matrix $\mathbf{G}_{\Omega,\Omega^c}$.



Note that the results in Theorems 4.1, 4.2, 4.3 and 4.4 apply to networks with arbitrary
topology, possibly including cycles. For a single source-destination pair the result in Theorem
4.3 generalizes the classical max-flow min-cut theorem for wired networks and for multicast,
the result in Theorem 4.4 generalizes the network coding result in [6]. As we will see in the
proof, the encoding functions at the relay nodes for the linear finite-field model can be restricted
to linear functions to obtain the result in Theorem 4.3.
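The minimum over cuts of the transfer-matrix rank in Theorem 4.3 can be evaluated by brute force for small networks. The sketch below is illustrative only: it assumes the shift-matrix form of the linear finite-field model over GF(2), in which a link of gain $n$ contributes the block $\mathbf{S}^{q-n}$ with $q$ the largest gain, and all function names and the example gains are hypothetical.

```python
from itertools import combinations


def gf2_rank(rows):
    """Rank over GF(2) of a 0/1 matrix, by elimination on integer bitmasks."""
    pivots = {}  # leading-bit position -> reduced row
    for row in rows:
        v = 0
        for bit in row:
            v = (v << 1) | bit
        while v:
            lead = v.bit_length() - 1
            if lead not in pivots:
                pivots[lead] = v
                break
            v ^= pivots[lead]
    return len(pivots)


def transfer_matrix(q, gains, inputs, outputs):
    """Stack the blocks S^(q-n_ij): rows are outputs in the sink side, columns inputs in the source side."""
    rows = []
    for j in outputs:
        for r in range(q):
            row = []
            for i in inputs:
                n = gains.get((i, j), 0)
                row.extend(1 if r - c == q - n else 0 for c in range(q))
            rows.append(row)
    return rows


def min_cut_rank(nodes, gains, source, dest):
    """Minimum of rank(G) over all cuts that keep the source and exclude the destination."""
    q = max(gains.values())
    relays = [v for v in nodes if v not in (source, dest)]
    best = None
    for size in range(len(relays) + 1):
        for subset in combinations(relays, size):
            omega = [source, *subset]
            omega_c = [v for v in nodes if v not in omega]
            r = gf2_rank(transfer_matrix(q, gains, omega, omega_c))
            best = r if best is None else min(best, r)
    return best


# Hypothetical single-relay network: 2 direct levels, 3 levels on each relay link.
example_gains = {("S", "R"): 3, ("S", "D"): 2, ("R", "D"): 3}
print(min_cut_rank(["S", "R", "D"], example_gains, "S", "D"))  # 3
```

For the single-relay case this reproduces $\max(n_{SD}, \min(n_{SR}, n_{RD}))$. The enumeration is exponential in the number of relays, so it is meant only for checking small examples.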

B. Gaussian relay networks 

In the Gaussian model each node $j \in \mathcal{V}$ has $M_j$ transmit and $N_j$ receive antennas. The received
signal $\mathbf{y}_j$ at node $j$ and time $t$ is

$$\mathbf{y}_j[t] = \sum_{i \in \mathcal{V}} \mathbf{H}_{ij} \mathbf{x}_i[t] + \mathbf{z}_j[t]$$ (41)

where $\mathbf{H}_{ij}$ is an $M_i \times N_j$ complex matrix whose $(k, l)$ element represents the channel gain
from the $k$-th transmit antenna in node $i$ to the $l$-th receive antenna in node $j$. Furthermore,
we assume there is an average power constraint equal to 1 at each transmit antenna. Also $\mathbf{z}_j$,
representing the channel noise, is modeled as a complex Gaussian random vector. The Gaussian
noises at different receivers are assumed to be independent of each other.

The following is our main result for Gaussian relay networks; it is proved in Section VI.

Theorem 4.5: The capacity $C$ of the Gaussian relay network satisfies

$$\bar{C} - \kappa \le C \le \bar{C},$$ (42)

where $\bar{C}$ is the cut-set upper bound on the capacity of $\mathcal{G}$ as described in Equation (27), and $\kappa$
is a constant and is upper bounded by $12 \sum_{i=1}^{|\mathcal{V}|} N_i + \min\big\{\sum_{i=1}^{|\mathcal{V}|} M_i, \sum_{i=1}^{|\mathcal{V}|} N_i\big\}$.

The gap $\kappa$ holds for all values of the channel gains and the result is relevant particularly in
the high rate regime. It is a stronger result than a degree-of-freedom result, because it is
non-asymptotic and provides a uniform guarantee to optimality for all channel SNRs. This is the first
constant-gap approximation of the capacity of Gaussian relay networks. As shown in Section III,
the gap between the achievable rate of well known relaying schemes and the cut-set upper bound
in general depends on the channel parameters and can become arbitrarily large. Analogous to the
results for deterministic networks, the result in Theorem 4.5 applies to a network with arbitrary
topology, possibly with cycles.

The result in Theorem 4.5 easily extends to the multicast case where we want to simultaneously
transmit one message from $S$ to all destinations $D \in \mathcal{D}$.

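Theorem 4.6: The multicast capacity $C_{\text{mult}}$ of the Gaussian relay network satisfies

$$\bar{C}_{\text{mult}} - \kappa \le C_{\text{mult}} \le \bar{C}_{\text{mult}},$$ (43)

where $\bar{C}_{\text{mult}}$ is the multicast cut-set upper bound on the capacity of $\mathcal{G}$ given by

$$\bar{C}_{\text{mult}} = \max_{p(\{\mathbf{x}_j\}_{j \in \mathcal{V}})} \ \min_{D \in \mathcal{D}} \ \min_{\Omega \in \Lambda_D} I(\mathbf{y}_{\Omega^c}; \mathbf{x}_{\Omega} \mid \mathbf{x}_{\Omega^c}),$$ (44)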

and $\kappa$ is a constant and is upper bounded by $12 \sum_{i=1}^{|\mathcal{V}|} N_i + \min\big\{\sum_{i=1}^{|\mathcal{V}|} M_i, \sum_{i=1}^{|\mathcal{V}|} N_i\big\}$.

The gap $\kappa$ stated in Theorems 4.5 and 4.6 holds for the simple scalar quantization scheme explored
in detail in Section VI. It is shown in [20] that a vector quantization scheme, even with
structured lattice codebooks, can improve this constant to $\sum_{i=1}^{|\mathcal{V}|} N_i + \min\big\{\sum_{i=1}^{|\mathcal{V}|} M_i, \sum_{i=1}^{|\mathcal{V}|} N_i\big\} \le
2 \sum_{i=1}^{|\mathcal{V}|} N_i$, which means that when all nodes have single antennas, the gap is at most $2|\mathcal{V}|$. Also, the
results have been extended in [21] to the case when there are multiple sources and all destinations
need to decode all the sources.



C. Proof program 

In the following sections we formally prove these main results. The main proof program
consists of first proving Theorem 4.3 and the corresponding multicast result for linear
finite-field deterministic networks in Section V. Since the proof logic of the achievable rate for general
deterministic networks (36), (37) is similar to that for the linear case, Theorems 4.1 and 4.2 are
proved in Appendix C. We use the proof ideas for the deterministic analysis to obtain the
universally-approximate capacity characterization for Gaussian relay networks in Section VI. In
both cases we illustrate the proof by first going through an example.

V. Deterministic relay networks 

In this section we characterize the capacity of linear finite-field deterministic relay networks
and prove Theorems 4.3 and 4.4.

To characterize the capacity of linear finite-field deterministic relay networks, we first focus
on networks that have a layered structure, i.e., all paths from the source to the destination have
equal lengths. With this special structure we get a major simplification: a sequence of messages
can each be encoded into a block of symbols and the blocks do not interact with each other as
they pass through the relay nodes in the network. The proof of the result for layered networks is
similar in style to the random coding argument in Ahlswede et al. [6]. We do this in Section V-A.
Next, in Section V-B, we extend the result to an arbitrary network by expanding the network
over time. Since the time-expanded network is layered, we can apply our result in the first step
to it and complete the proof.

A. Layered networks 

The network given in Figure 11 is an example of a layered network where the number of
hops for each path from $S$ to $D$ is three. We start by describing the encoding scheme.

1) Encoding for layered linear deterministic relay network: We have a single source $S$ with
a sequence of messages $w_k \in \{1, 2, \ldots, 2^{TR}\}$, $k = 1, 2, \ldots$. Each message is encoded by the
source $S$ into a signal over $T$ transmission times (symbols), giving an overall transmission rate
of $R$. Relay $j$ operates over blocks of time $T$ symbols, and uses a mapping $f_j : \mathcal{Y}_j^T \to \mathcal{X}_j^T$ on
its received symbols from the previous block of $T$ symbols to transmitted signals in the next
block. For the linear deterministic model (22), we use linear mappings $f_j(\cdot)$, i.e.,

$$\mathbf{x}_j = \mathbf{F}_j \mathbf{y}_j,$$ (45)

where the vectors $\mathbf{x}_j = [\mathbf{x}_j[1], \ldots, \mathbf{x}_j[T]]^t$ and $\mathbf{y}_j = [\mathbf{y}_j[1], \ldots, \mathbf{y}_j[T]]^t$ respectively represent
the transmit and received signals over $T$ time units, and the matrix $\mathbf{F}_j$ is chosen uniformly
randomly over all matrices in $\mathbb{F}_2^{qT \times qT}$. Each relay does the encoding prescribed by (45). Given
the knowledge of all the encoding functions $\mathbf{F}_j$ at the relays, the destination $D$ attempts to
decode each message $w_k$ sent by the source. This encoding strategy is illustrated in Figure 15.

Footnote (on the time-expanded network): The concept of time-expanded network is also used in [6], but the use there is to handle cycles. Our main use is to handle
interaction between messages transmitted at different times, an issue that only arises when there is superposition of signals at
nodes.
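A relay's operation in this scheme is just a multiplication by a random binary matrix. The sketch below is a minimal illustration under assumptions that are not from the paper: the field is taken to be GF(2), the block dimension `q * T` uses small hypothetical values of `q` and `T`, and a fixed seed is used only to make the run repeatable.

```python
import random


def random_binary_matrix(n_rows, n_cols, rng):
    """Matrix with i.i.d. uniform entries in {0, 1}, i.e. uniform over all GF(2) matrices."""
    return [[rng.randrange(2) for _ in range(n_cols)] for _ in range(n_rows)]


def apply_gf2(matrix, vector):
    """Matrix-vector product over GF(2)."""
    return [sum(m & v for m, v in zip(row, vector)) % 2 for row in matrix]


def xor_vectors(u, v):
    """Component-wise sum over GF(2)."""
    return [a ^ b for a, b in zip(u, v)]


rng = random.Random(0)          # fixed seed, for repeatability only
q, T = 2, 3                     # hypothetical signal length and block length
F_j = random_binary_matrix(q * T, q * T, rng)

# Received blocks at relay j under two different messages w and w'.
y_w = [rng.randrange(2) for _ in range(q * T)]
y_w_prime = [rng.randrange(2) for _ in range(q * T)]

x_w = apply_gf2(F_j, y_w)
x_w_prime = apply_gf2(F_j, y_w_prime)

# Linearity: the difference of transmit signals depends only on the difference of inputs.
difference_is_linear = (
    xor_vectors(x_w, x_w_prime) == apply_gf2(F_j, xor_vectors(y_w, y_w_prime))
)
print(difference_is_linear)  # True
```

Linearity is what the error analysis relies on: whether two messages collide downstream depends only on the difference of the relay inputs, and a nonzero difference is mapped by a uniformly random matrix to a uniformly distributed vector.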

Fig. 15. Illustration of linear encoding strategy.

Suppose message $w_k$ is sent by the source in block $k$. Since each relay $j$ operates only on
blocks of length $T$ and the network is layered, the signals received at block $k$ at any relay pertain
only to message $w_{k - l_j}$, where $l_j$ is the path length from the source to relay $j$.

2) Proof illustration: In order to illustrate the proof ideas of Theorem 4.1 we examine the
network shown in Figure 16.

Without loss of generality consider the message $w = w_1$ transmitted by the source at block
$k = 1$. At node $j$ the signals pertaining to this message are received at block $l_j$.
For notational simplicity we will drop the block numbers associated with the transmitted and
received signals for this analysis.

Now, since we have a deterministic network, the message $w$ will be mistaken for another
message $w'$ only if the received signal $\mathbf{y}_D(w)$ under $w$ is the same as that which would have been
received under $w'$. This leads to a notion of distinguishability: messages $w, w'$ are distinguishable
at any node $j$ if $\mathbf{y}_j(w) \ne \mathbf{y}_j(w')$.

Fig. 16. An example of layered relay network. Nodes on the left hand side of the cut can distinguish between messages $w$
and $w'$, while nodes on the right hand side cannot.

The probability of error at destination $D$ can be upper bounded using the union bound as

$$P_e \le 2^{RT}\, P\{w \to w'\} = 2^{RT}\, P\{\mathbf{y}_D(w) = \mathbf{y}_D(w')\}.$$ (46)

Since channels are deterministic, the randomness is only due to that of the encoder maps.
Therefore, the probability of this event depends on the probability that we choose such encoder
maps. Now, we can write

$$P\{w \to w'\} = \sum_{\Omega \in \Lambda_D} P\{\text{Nodes in } \Omega \text{ can distinguish } w, w' \text{ and nodes in } \Omega^c \text{ cannot}\},$$ (47)

since the events that correspond to occurrence of the distinguishability sets $\Omega \in \Lambda_D$ are disjoint.
Let us examine one term in the summation in (47). For example, consider the cut $\Omega = \{S, A_1, B_1\}$
shown in Figure 16. A necessary condition for this cut to be the distinguishability set is that
$\mathbf{y}_{A_2}(w) = \mathbf{y}_{A_2}(w')$, along with $\mathbf{y}_{B_2}(w) = \mathbf{y}_{B_2}(w')$ and $\mathbf{y}_D(w) = \mathbf{y}_D(w')$. We first define the
following events:

$\mathcal{A}_i$ = the event that $w$ and $w'$ are undistinguished at node $A_i$ (i.e., $\mathbf{y}_{A_i}(w) = \mathbf{y}_{A_i}(w')$), $i = 1, 2$,
$\mathcal{B}_i$ = the event that $w$ and $w'$ are undistinguished at node $B_i$ (i.e., $\mathbf{y}_{B_i}(w) = \mathbf{y}_{B_i}(w')$), $i = 1, 2$,
$\mathcal{D}$ = the event that $w$ and $w'$ are undistinguished at node $D$ (i.e., $\mathbf{y}_D(w) = \mathbf{y}_D(w')$).

We now have



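$$\mathcal{P} = P\{\mathcal{A}_2, \mathcal{B}_2, \mathcal{D}, \mathcal{A}_1^c, \mathcal{B}_1^c\}$$ (48)

$$= P\{\mathcal{A}_2\} \times P\{\mathcal{B}_2, \mathcal{A}_1^c \mid \mathcal{A}_2\} \times P\{\mathcal{D}, \mathcal{B}_1^c \mid \mathcal{A}_2, \mathcal{B}_2, \mathcal{A}_1^c\}$$ (49)

$$\le P\{\mathcal{A}_2\} \times P\{\mathcal{B}_2 \mid \mathcal{A}_1^c, \mathcal{A}_2\} \times P\{\mathcal{D} \mid \mathcal{B}_1^c, \mathcal{A}_2, \mathcal{B}_2, \mathcal{A}_1^c\}$$ (50)

$$= P\{\mathcal{A}_2\} \times P\{\mathcal{B}_2 \mid \mathcal{A}_1^c, \mathcal{A}_2\} \times P\{\mathcal{D} \mid \mathcal{B}_1^c, \mathcal{B}_2\}$$ (51)

where the last step is true since there is an independent random mapping at each node and
we have the following Markov structure in the network

$$\mathbf{x}_S \to (\mathbf{y}_{A_1}, \mathbf{y}_{A_2}) \to (\mathbf{y}_{B_1}, \mathbf{y}_{B_2}) \to \mathbf{y}_D.$$ (52)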



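As the source does a random linear mapping of the message onto $\mathbf{x}_S(w)$, the probability of
$\mathcal{A}_2$ is

$$P\{\mathcal{A}_2\} = P\{(\mathbf{I}_T \otimes \mathbf{G}_{S,A_2})(\mathbf{x}_S(w) - \mathbf{x}_S(w')) = \mathbf{0}\} = 2^{-T\,\mathrm{rank}(\mathbf{G}_{S,A_2})},$$ (53)

because the random mapping given in (45) induces independent uniformly distributed $\mathbf{x}_S(w), \mathbf{x}_S(w')$.
Here, $\otimes$ is the Kronecker matrix product. Now, in order to analyze the second probability, we
see that $\mathcal{A}_2$ implies $\mathbf{x}_{A_2}(w) = \mathbf{x}_{A_2}(w')$, i.e., the same signal is sent under both $w, w'$. Also
if $\mathbf{y}_{A_1}(w) \ne \mathbf{y}_{A_1}(w')$, then the random mapping given in (45) induces independent uniformly
distributed $\mathbf{x}_{A_1}(w), \mathbf{x}_{A_1}(w')$. Therefore, we get

$$P\{\mathcal{B}_2 \mid \mathcal{A}_1^c, \mathcal{A}_2\} = P\{(\mathbf{I}_T \otimes \mathbf{G}_{A_1,B_2})(\mathbf{x}_{A_1}(w) - \mathbf{x}_{A_1}(w')) = \mathbf{0}\} = 2^{-T\,\mathrm{rank}(\mathbf{G}_{A_1,B_2})}.$$ (54)

Similarly, we get

$$P\{\mathcal{D} \mid \mathcal{B}_1^c, \mathcal{B}_2\} = P\{(\mathbf{I}_T \otimes \mathbf{G}_{B_1,D})(\mathbf{x}_{B_1}(w) - \mathbf{x}_{B_1}(w')) = \mathbf{0}\} = 2^{-T\,\mathrm{rank}(\mathbf{G}_{B_1,D})}.$$ (55)

Putting these together we see that in (47), for the network in Figure 16, we have

$$\mathcal{P} \le 2^{-T\,\mathrm{rank}(\mathbf{G}_{S,A_2})}\, 2^{-T\,\mathrm{rank}(\mathbf{G}_{A_1,B_2})}\, 2^{-T\,\mathrm{rank}(\mathbf{G}_{B_1,D})} = 2^{-T\{\mathrm{rank}(\mathbf{G}_{S,A_2}) + \mathrm{rank}(\mathbf{G}_{A_1,B_2}) + \mathrm{rank}(\mathbf{G}_{B_1,D})\}}.$$ (56)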



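Footnote (on the Kronecker product): If $A$ is an $m$-by-$n$ matrix and $B$ is a $p$-by-$q$ matrix, then the Kronecker product $A \otimes B$ is the $mp$-by-$nq$ block matrix

$$A \otimes B = \begin{bmatrix} a_{11} B & \cdots & a_{1n} B \\ \vdots & \ddots & \vdots \\ a_{m1} B & \cdots & a_{mn} B \end{bmatrix}.$$

Note that since

$$\mathbf{G}_{\Omega,\Omega^c} = \begin{bmatrix} \mathbf{G}_{S,A_2} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{G}_{A_1,B_2} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{G}_{B_1,D} \end{bmatrix},$$

the upper bound for $\mathcal{P}$ in (56) is exactly $2^{-T\,\mathrm{rank}(\mathbf{G}_{\Omega,\Omega^c})}$. Therefore, by substituting this back
into (47) and (46), we get

$$P_e \le 2^{RT}\, |\Lambda_D|\, 2^{-T \min_{\Omega \in \Lambda_D} \mathrm{rank}(\mathbf{G}_{\Omega,\Omega^c})},$$ (57)

which can be made as small as desired if $R < \min_{\Omega \in \Lambda_D} \mathrm{rank}(\mathbf{G}_{\Omega,\Omega^c})$, which is the result claimed
in Theorem 4.3.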
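To see how quickly a bound of the form $P_e \le 2^{RT} |\Lambda_D| 2^{-T \min_\Omega \mathrm{rank}(\mathbf{G}_{\Omega,\Omega^c})}$ decays, consider a rough numerical illustration. The network of Figure 16 has four relays, so there are $|\Lambda_D| = 2^4 = 16$ source-destination cuts; the minimum rank, the rate and the block length below are hypothetical values chosen only for the arithmetic.

```latex
% Hypothetical values: \min_\Omega \mathrm{rank}(G_{\Omega,\Omega^c}) = 3,\quad R = 2.5,\quad T = 40
P_e \;\le\; 2^{RT}\,|\Lambda_D|\,2^{-3T}
    \;=\; 2^{100}\cdot 2^{4}\cdot 2^{-120}
    \;=\; 2^{-16} \;\approx\; 1.5\times 10^{-5}.
```

The factor $|\Lambda_D|$ does not depend on $T$, so for any rate strictly below the minimum rank the bound decays exponentially in the block length.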

3) Proof of Theorems 4.3 and 4.4 for general layered networks: Consider the message $w = w_1$
transmitted by the source at block $k = 1$. The message $w$ will be mistaken for another message
$w'$ only if the received signal $\mathbf{y}_D(w)$ under $w$ is the same as that which would have been received
under $w'$. Hence the probability of error at destination $D$ can be upper bounded by

$$P_e \le 2^{RT}\, P\{w \to w'\} = 2^{RT}\, P\{\mathbf{y}_D(w) = \mathbf{y}_D(w')\}.$$ (58)

Similar to Section V-A2, we can write

$$P\{w \to w'\} = \sum_{\Omega \in \Lambda_D} P\{\text{Nodes in } \Omega \text{ can distinguish } w, w' \text{ and nodes in } \Omega^c \text{ cannot}\}.$$ (59)

For any such cut $\Omega$, define the following sets:

• $L_l(\Omega)$: the nodes that are in $\Omega$ and are at layer $l$ (for example $S \in L_1(\Omega)$),

• $R_l(\Omega)$: the nodes that are in $\Omega^c$ and are at layer $l$ (for example $D \in R_{l_D}(\Omega)$).

We now define the following events:

• $\mathcal{L}_l$: Event that the nodes in $L_l$ can distinguish between $w$ and $w'$, i.e., $\mathbf{y}_{L_l}(w) \ne \mathbf{y}_{L_l}(w')$,

• $\mathcal{R}_l$: Event that the nodes in $R_l$ cannot distinguish between $w$ and $w'$, i.e., $\mathbf{y}_{R_l}(w) = \mathbf{y}_{R_l}(w')$.

Similar to Section V-A2, we can write

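$$\mathcal{P} = P\{\mathcal{R}_l, \mathcal{L}_{l-1},\ l = 2, \ldots, l_D\}$$ (60)

$$= \prod_{l=2}^{l_D} P\{\mathcal{R}_l, \mathcal{L}_{l-1} \mid \mathcal{R}_j, \mathcal{L}_{j-1},\ j = 2, \ldots, l-1\}$$ (61)

$$\le \prod_{l=2}^{l_D} P\{\mathcal{R}_l \mid \mathcal{R}_j, \mathcal{L}_j,\ j = 2, \ldots, l-1\}$$ (62)

$$\overset{(a)}{=} \prod_{l=2}^{l_D} P\{\mathcal{R}_l \mid \mathcal{R}_{l-1}, \mathcal{L}_{l-1}\}$$ (63)

where (a) is true due to the Markovian nature of the layered network. Note that as in the example,
all nodes in $R_{l-1}$ transmit the same signal under both $w$ and $w'$ (i.e., $\mathbf{x}_j(w) = \mathbf{x}_j(w')$, $\forall j \in R_{l-1}$).
Therefore, just as in Section V-A2, we see that

$$P\{\mathcal{R}_l \mid \mathcal{R}_{l-1}, \mathcal{L}_{l-1}\} = P\{\mathbf{y}_{R_l}(w) = \mathbf{y}_{R_l}(w') \mid \mathbf{y}_{L_{l-1}}(w) \ne \mathbf{y}_{L_{l-1}}(w'),\ \mathbf{y}_{R_{l-1}}(w) = \mathbf{y}_{R_{l-1}}(w')\}$$ (64)

$$= P\{\mathbf{y}_{R_l}(w) = \mathbf{y}_{R_l}(w') \mid \mathbf{y}_{L_{l-1}}(w) \ne \mathbf{y}_{L_{l-1}}(w'),\ \mathbf{x}_{R_{l-1}}(w) = \mathbf{x}_{R_{l-1}}(w')\}$$ (65)

$$= P\{(\mathbf{I}_T \otimes \mathbf{G}_{L_{l-1},R_l})(\mathbf{x}_{L_{l-1}}(w) - \mathbf{x}_{L_{l-1}}(w')) = \mathbf{0} \mid \mathbf{y}_{L_{l-1}}(w) \ne \mathbf{y}_{L_{l-1}}(w')\}$$ (66)

$$\overset{(a)}{=} 2^{-T\,\mathrm{rank}(\mathbf{G}_{L_{l-1},R_l})},$$ (67)

where $\mathbf{G}_{L_{l-1},R_l}$ is the transfer matrix from transmitted signals in $L_{l-1}$ to the received signals in
$R_l$. Step (a) is true since $\mathbf{y}_{L_{l-1}}(w) \ne \mathbf{y}_{L_{l-1}}(w')$ and hence the random mapping given in (45)
induces independent uniformly distributed $\mathbf{x}_{L_{l-1}}(w), \mathbf{x}_{L_{l-1}}(w')$.
Therefore we get

$$\mathcal{P} \le \prod_{l=2}^{l_D} 2^{-T\,\mathrm{rank}(\mathbf{G}_{L_{l-1},R_l})} = 2^{-T\,\mathrm{rank}(\mathbf{G}_{\Omega,\Omega^c})}.$$ (68)

By substituting this back into (59) and (58), we see that

$$P_e \le 2^{RT}\, |\Lambda_D|\, 2^{-T \min_{\Omega \in \Lambda_D} \mathrm{rank}(\mathbf{G}_{\Omega,\Omega^c})},$$ (69)

which can be made as small as desired if $R < \min_{\Omega \in \Lambda_D} \mathrm{rank}(\mathbf{G}_{\Omega,\Omega^c})$, which is the result claimed
in Theorem 4.3 for layered networks.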

To prove Theorem 4.4 for layered networks, we note that for any destination $D \in \mathcal{D}$, the
probability of error expression in (69) holds. Therefore, if all receivers in $\mathcal{D}$ have to be able to
decode the message, then an error occurs if any of them fails to decode. Therefore, using the
union bound and (69) we can bound this error probability as

$$P_e \le 2^{RT} \sum_{D \in \mathcal{D}} |\Lambda_D|\, 2^{-T \min_{\Omega \in \Lambda_D} \mathrm{rank}(\mathbf{G}_{\Omega,\Omega^c})},$$ (70)

which clearly goes to zero as long as $R < \min_{D \in \mathcal{D}} \min_{\Omega \in \Lambda_D} \mathrm{rank}(\mathbf{G}_{\Omega,\Omega^c})$, which is the result
claimed in Theorem 4.4 for layered networks.

Therefore, we have proved a special case of Theorem 4.4 for layered networks.

Fig. 17. An example of a general deterministic network with unequal paths from $S$ to $D$ is shown in (a). The corresponding
unfolded network is shown in (b). Panel (a): an example of a general deterministic network. Panel (b): unfolded deterministic
network; examples of steady cuts and dipping cuts are respectively shown by solid and dotted lines.

B. Arbitrary networks (not necessarily layered) 
First we formally describe the encoding strategy:

1) Encoding for arbitrary linear finite-field relay networks: We have a single source $S$ with
message $w \in \{1, 2, \ldots, 2^{KTR}\}$ which is encoded by the source $S$ into a signal over $KT$
transmission times (symbols), giving an overall transmission rate of $R$. Relay $j$ operates over
blocks of time $T$ symbols, and at the $k$-th block uses a linear mapping $f_j^{(k)}(\cdot)$ to map its received
symbols from all the previous $k - 1$ blocks of $T$ symbols to transmitted signals in the next block,
i.e.,

$$\mathbf{x}_j^{(k)} = \mathbf{F}_j^{(k)} \big[\mathbf{y}_j^{(1)}, \ldots, \mathbf{y}_j^{(k-1)}\big]^t,$$ (71)

where $\mathbf{F}_j^{(k)}$ is chosen uniformly randomly over all matrices in $\mathbb{F}_2^{qT \times (k-1)qT}$. Each relay does the
encoding prescribed by (71). Given the knowledge of all the encoding functions $\mathbf{F}_j^{(k)}$ at the
relays, the destination $D \in \mathcal{D}$ attempts to decode the message $w$ sent by the source.

Given the proof for layered networks with equal path lengths, we are ready to tackle the proof
of Theorem 4.3 and Theorem 4.4 for general relay networks. The ingredients are developed below.
First, we can explicitly represent our relaying scheme by unfolding the network over time to
create a layered network. The idea is to unfold the network to $K$ stages such that the $i$-th stage
represents what happens in the network during symbol times $(i - 1)T$ to $iT - 1$. For example,
in Figure 17(a) a network with unequal paths from $S$ to $D$ is shown. Figure 17(b) shows the
unfolded form of this network. Each node $v \in \mathcal{V}$ appears at stage $1 \le i \le K$ as $v[i]$. There are
additional nodes: $T[i]$'s and $R[i]$'s. These nodes are virtual transmitters and receivers that are
put to buffer and synchronize the network. Since all the communication links connected to these
nodes ($T[i]$'s and $R[i]$'s) are modelled as wired links without any capacity limit, they do not
impose any constraint on the network. Note that there are also infinite-capacity links between
nodes such as $v[1], v[2], \ldots$, that are copies of the same node $v$ at different blocks.

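Lemma 5.1: Assume $\mathcal{G}$ is a linear finite-field network and $\mathcal{G}_{\text{unf}}^{(K)}$ is a network obtained by
unfolding $\mathcal{G}$ over K time steps (as shown in Figure 17). Then a communication rate of

$$R < \frac{1}{K} \min_{\Omega_{\text{unf}} \in \Lambda_D} \operatorname{rank}\left(G_{\Omega_{\text{unf}}, \Omega_{\text{unf}}^c}\right) \tag{72}$$

is achievable in $\mathcal{G}$, where the minimum is taken over all cuts $\Omega_{\text{unf}}$ in $\mathcal{G}_{\text{unf}}^{(K)}$.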

Proof: By unfolding $\mathcal{G}$ we get an acyclic layered finite-field network. Therefore, by our
result of Section V-A3, we can achieve the rate

$$R_{\text{unf}} < \min_{\Omega_{\text{unf}} \in \Lambda_D} \operatorname{rank}\left(G_{\Omega_{\text{unf}}, \Omega_{\text{unf}}^c}\right) \tag{73}$$

in the time-expanded graph. Since it takes K steps to translate an achievable scheme in the
time-expanded graph to an achievable scheme in the original graph, the rate achieved in the
original graph is $R_{\text{unf}}/K$ and the Lemma is proved. ■

The achievability scheme we used to prove Lemma 5.1 was obtained by applying the encoding
scheme described in Section V-A1 to the network that is unfolded over K blocks. This translates
to the encoding scheme defined in Section V-B1 for a general linear finite-field relay network.

If we look at different cuts in the time-expanded graph we notice that there are two types 
of cuts with finite value. One type separates the nodes at the different stages identically. An 
example of such a "steady" cut is drawn with solid line in Figure [17] (b) which separates {S, A} 
from {B, D} at all stages. Clearly each steady cut in the time-expanded graph corresponds to a 
cut in the original graph and moreover its value is K times the value of the corresponding cut in 
the original network. However there is another type of cut. An example of such a "dipping" cut 
is drawn with dotted line in Figure [17] (b). A dipping cut can be thought of as a list of steady 
cuts with a number of downward transitions between them. Note that if a cut has an upward 
transition, then one of the infinite-capacity links will cross the cut from source to destination; 
hence the cut-value becomes infinity. So we only need to consider the cuts with downward 
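transitions. However, since the number of downward transitions is at most $|\mathcal{V}|$, the value of a
dipping cut is at least $K - |\mathcal{V}|$ times the min-cut value of the original network.
Therefore

$$\min_{\Omega_{\text{unf}} \in \Lambda_D} \operatorname{rank}\left(G_{\Omega_{\text{unf}}, \Omega_{\text{unf}}^c}\right) \ge (K - |\mathcal{V}|) \min_{\Omega \in \Lambda_D} \operatorname{rank}\left(G_{\Omega, \Omega^c}\right). \tag{74}$$

This, combined with Lemma 5.1 and letting $K \to \infty$, completes the proof of Theorem 4.3.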
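To make the cut values appearing in (72) and (74) concrete, here is a minimal sketch that enumerates all source-destination cuts of a small network and computes the rank of each cut's transfer matrix over GF(2). The four-node diamond network with unit scalar gains, and the helper names `gf2_rank` and `min_cut_rank`, are illustrative assumptions and do not come from the paper.

```python
from itertools import combinations


def gf2_rank(rows):
    """Rank over GF(2) of a matrix given as a list of 0/1 rows."""
    # Pack each non-empty row into an integer so that row addition is XOR.
    masks = [int("".join(map(str, r)), 2) for r in rows if r]
    rank = 0
    while masks:
        pivot = masks.pop()
        if pivot == 0:
            continue
        rank += 1
        low = pivot & -pivot  # lowest set bit is used as the pivot column
        masks = [m ^ pivot if m & low else m for m in masks]
    return rank


def min_cut_rank(nodes, gains, source, dest):
    """Minimum over all cuts (source in Omega, dest in Omega^c) of rank(G_{Omega,Omega^c})."""
    others = [v for v in nodes if v not in (source, dest)]
    best = None
    for r in range(len(others) + 1):
        for extra in combinations(others, r):
            omega = [source, *extra]
            omega_c = [v for v in nodes if v not in omega]
            # Rows index receivers in Omega^c, columns index transmitters in Omega.
            G = [[gains.get((tx, rx), 0) for tx in omega] for rx in omega_c]
            rank = gf2_rank(G)
            best = rank if best is None else min(best, rank)
    return best


# Hypothetical diamond network S -> {A, B} -> D with one-bit links.
diamond_gains = {("S", "A"): 1, ("S", "B"): 1, ("A", "D"): 1, ("B", "D"): 1}
print(min_cut_rank(["S", "A", "B", "D"], diamond_gains, "S", "D"))  # 1
```

In this toy example the cut {S} has rank 1 because S broadcasts the same bit to both relays, and the cut {S, A, B} has rank 1 because the two relay signals superimpose at D. Brute-force enumeration is exponential in the number of relays and is only meant for very small networks.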

VI. Gaussian relay networks

So far, we have focused on deterministic relay networks. As we illustrated in Sections II
and III, the linear finite-field deterministic model captures some (but not all) aspects of the high
SNR behavior of the Gaussian model. Therefore we have some hope to be able to translate the
intuition and the techniques used in the deterministic analysis to obtain approximate results for
Gaussian relay networks. This is what we will accomplish in this section.

* An alternate proof of the same result was given in [22]. In that proof, only the previous received block was used by the
relays, instead of the larger number of blocks used above. However, we needed to use the sub-modularity properties of entropy
to demonstrate the performance of that scheme [22].

Theorem 4.5 is the main result for Gaussian relay networks and this section is devoted to
proving it. The proof of the result for layered networks is done in Section VI-A. We extend the
result to an arbitrary network by expanding the network over time, as done in Section V. We
first prove the theorem for the single antenna case, then at the end we extend it to the multiple
antenna scenario.

A. Layered Gaussian relay networks 

In this section we prove Theorem 4.5 for the special case of layered networks, where all paths
from the source to the destination in $\mathcal{G}$ have equal length.

1) Proof illustration: Our proof has two steps. In the first step we propose a relaying strategy, 
which is similar to our strategy for deterministic networks, and show that by operating over a 
large block, it is possible to achieve an end-to-end mutual information which is within a constant 
gap to the cut-set upper bound. Therefore, the relaying strategy together with the whole network 
creates an inner code which provides certain mutual information between the source and the 
destination. Each symbol of this inner code is a block. In the next step, we use an outer code to 
map the message to multiple inner code symbols and send them to the destination. By coding 
over many such symbols, it is possible to achieve a reliable communication rate arbitrarily close 
to the mutual information of the inner code, and hence the proof is complete. The system diagram
of our coding strategy is illustrated in Figure 18.

We now explicitly describe our encoding strategy.

2) Encoding for layered Gaussian relay networks: We first define a quantization operation.

Definition 6.1: The quantization operation $[\cdot] : \mathbb{C} \to \mathbb{Z} \times \mathbb{Z}$ maps a complex number $c = x + iy$
to $[c] = ([x], [y])$, where $[x]$ and $[y]$ are the closest integers to x and y, respectively. Since
the Gaussian noise at all receive antennas has variance 1, this operation is basically scalar
quantization at the noise level.
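As a small illustration of Definition 6.1, the following sketch rounds the real and imaginary parts of a complex sample to the nearest integers. The function names are hypothetical, and the tie-breaking rule (halves are rounded up) is an assumption, since the definition does not specify how ties are resolved.

```python
import math


def round_nearest(x):
    """Closest integer to x; exact halves are rounded up (an assumed tie rule)."""
    return math.floor(x + 0.5)


def quantize(c):
    """Map a complex number c = x + iy to the integer pair ([x], [y])."""
    return (round_nearest(c.real), round_nearest(c.imag))


print(quantize(1.4 - 2.6j))  # (1, -3)
print(quantize(0.2 + 0.7j))  # (0, 1)
```

Two samples that quantize to the same pair differ by less than 1 in each coordinate, so their distance in the complex plane is less than $\sqrt{2}$.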

As shown in Figure 18, the encoding consists of an inner code and an outer code:

a) Inner code: Each symbol of the inner code is represented by $u \in \{1, \ldots, 2^{R_{\text{in}}T}\}$, where
T and $R_{\text{in}}$ are respectively the block length and the rate of the inner code. The source node S
generates a set of $2^{R_{\text{in}}T}$ independent complex Gaussian codewords of length T with components
distributed as i.i.d. $\mathcal{CN}(0, 1)$, denoted by $\mathcal{T}_{x_S}$. At relay node i, there is also a random mapping
$F_i : (\mathbb{Z}^T, \mathbb{Z}^T) \to \mathcal{T}_{x_i}$, which maps each quantized received signal vector of length T independently
into an i.i.d. $\mathcal{CN}(0, 1)$ random vector of length T. A particular realization of $F_i$ is denoted by
$f_i$. Summarizing:

• Source: maps each inner code symbol $u \in \{1, \ldots, 2^{R_{\text{in}}T}\}$ to $F_S(u) \in \mathcal{T}_{x_S}$.

• Relay i: receives $\mathbf{y}_i$ of length T, quantizes it to $[\mathbf{y}_i]$, then maps it to $F_i([\mathbf{y}_i]) \in \mathcal{T}_{x_i}$.

b) Outer code: The message is encoded by the source into N inner code symbols, $u_1, \ldots, u_N$.
Each inner code symbol is then sent via the inner code over T transmission times, giving an
overall transmission rate of R. The received signal at the destination, corresponding to inner
code symbol $u_i$, is denoted by $\mathbf{y}_{D,i}$, $i = 1, \ldots, N$.

Now, given the knowledge of all the encoding functions $F_i$ at the relays and the quantized
received signals $[\mathbf{y}_{D,1}], \ldots, [\mathbf{y}_{D,N}]$, the destination attempts to decode the message sent by the
source.
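The relay operation described above (quantize at the noise level, then map the quantized block to a random Gaussian codeword) can be sketched as follows. This is only an illustrative sketch: the class name is hypothetical, the random mapping is generated lazily (a fresh codeword is drawn the first time a quantized block is seen, which is assumed to be equivalent to fixing the whole mapping in advance), and halves are rounded up when quantizing.

```python
import math
import random


class QuantizeMapForwardRelay:
    """Toy relay: quantize a received block, then look up a random codeword."""

    def __init__(self, block_length, seed=0):
        self.block_length = block_length
        self.rng = random.Random(seed)
        self.mapping = {}  # quantized block -> transmit codeword (filled lazily)

    def _quantize(self, block):
        # Scalar quantization of real and imaginary parts at the noise level.
        return tuple(
            (math.floor(c.real + 0.5), math.floor(c.imag + 0.5)) for c in block
        )

    def _draw_codeword(self):
        # CN(0, 1): real and imaginary parts each have variance 1/2.
        std = math.sqrt(0.5)
        return [
            complex(self.rng.gauss(0.0, std), self.rng.gauss(0.0, std))
            for _ in range(self.block_length)
        ]

    def forward(self, received_block):
        key = self._quantize(received_block)
        if key not in self.mapping:
            self.mapping[key] = self._draw_codeword()
        return self.mapping[key]


relay = QuantizeMapForwardRelay(block_length=2)
x1 = relay.forward([0.1 + 0.2j, 1.4 - 0.3j])
x2 = relay.forward([0.3 - 0.1j, 0.6 + 0.4j])  # same quantization bins as x1
print(x1 == x2)  # True
```

Note that the relay needs no channel knowledge: the mapping depends only on the quantized received block.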

3) Proof of Theorem 4.5 for layered networks: Our first goal is to lower bound the average
end-to-end mutual information, averaged over the random mappings $F_{\mathcal{V}} = \{F_i : i \in \mathcal{V}\}$, achieved
by the inner code defined in Subsection VI-A2.

Note that

$$\frac{1}{T} I(u; [\mathbf{y}_D] \mid F_{\mathcal{V}}) \ge \frac{1}{T} I(u; [\mathbf{y}_D] \mid \mathbf{z}_{\mathcal{V}}, F_{\mathcal{V}}) - \frac{1}{T} H([\mathbf{y}_D] \mid u, F_{\mathcal{V}}) \tag{75}$$

where $\mathbf{z}_{\mathcal{V}}$ is the vector of the channel noises at all nodes in the network. The first term on the
right hand side of (75) is the average end-to-end mutual information conditioned on the noise
vector. Once we condition on a noise vector, the network turns into a deterministic network. We
use an analysis technique similar to the one we used for linear deterministic relay networks to
upper bound the probability that the destination will confuse an inner code symbol with another
and then use Fano's inequality to lower bound the end-to-end mutual information. This is done
in Lemma 6.3. The second term on the RHS of (75) is the average entropy of the received
signal conditioned on the source's transmit signal, and is upper bounded in Lemma 6.5. This
term represents roughly the penalty due to noise-forwarding at the relays, and is proportional to
the number of relay nodes.
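Definition 6.2: We define

$$\overline{C}_{\text{i.i.d.}} = \min_{\Omega \in \Lambda_D} I(x_{\Omega}; y_{\Omega^c} \mid x_{\Omega^c}) \tag{76}$$

where $x_i$, $i \in \mathcal{V}$, are i.i.d. $\mathcal{CN}(0, 1)$ random variables.

Lemma 6.3: Assume all nodes perform the operation described in Subsection VI-A2 (a) and
the inner code symbol U is distributed uniformly over $\{1, \ldots, 2^{R_{\text{in}}T}\}$. Then

$$I(u; [\mathbf{y}_D] \mid \mathbf{z}_{\mathcal{V}}, F_{\mathcal{V}}) \ge R_{\text{in}}T - \left(1 + \min\left\{1, 2^{|\mathcal{V}|} 2^{-T(\overline{C}_{\text{i.i.d.}} - |\mathcal{V}| - R_{\text{in}})}\right\} R_{\text{in}}T\right) \tag{77}$$

where $\overline{C}_{\text{i.i.d.}}$ is defined in Definition 6.2.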
Proof:

Consider a fixed noise realization in the network, $\mathbf{z}_{\mathcal{V}} = \mathbf{a}$. Suppose the destination attempts to
detect the transmitted symbol u at the source given the received signal, all the mappings, channel
gains, and $\mathbf{a}$. A symbol value u will be mistaken for another value u' only if the received signal
$[\mathbf{y}_D(u)]$ under u is the same as what would have been received under u'. This leads to a notion
of distinguishability for a fixed $\mathbf{a}$, which is that symbol values u, u' are distinguishable at any
node j if $[\mathbf{y}_j(u)] \ne [\mathbf{y}_j(u')]$. Hence,

$$P\{u \to u' \mid \mathbf{z}_{\mathcal{V}} = \mathbf{a}\} = \sum_{\Omega \in \Lambda_D} \underbrace{P\{\text{Nodes in } \Omega \text{ can distinguish } u, u' \text{ and nodes in } \Omega^c \text{ cannot} \mid \mathbf{z}_{\mathcal{V}} = \mathbf{a}\}}_{\mathcal{P}} \tag{78}$$

For any cut $\Omega \in \Lambda_D$, define the following sets:

• $L_l(\Omega)$: the nodes that are in $\Omega$ and are at layer l (for example $S \in L_1(\Omega)$),

• $R_l(\Omega)$: the nodes that are in $\Omega^c$ and are at layer l (for example $D \in R_{l_D}(\Omega)$).

We also define the following events:

• $\mathcal{L}_l$: Event that the nodes in $L_l$ can distinguish between u and u', i.e., $[\mathbf{y}_{L_l}(u)] \ne [\mathbf{y}_{L_l}(u')]$,

• $\mathcal{R}_l$: Event that the nodes in $R_l$ cannot distinguish between u and u', i.e., $[\mathbf{y}_{R_l}(u)] = [\mathbf{y}_{R_l}(u')]$.

Note that the source node by definition distinguishes between the two distinct messages u, u',
i.e. $P\{\mathcal{L}_1\} = 1$.

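Then

$$\begin{aligned}
\mathcal{P} &= P\{\mathcal{R}_l, \mathcal{L}_{l-1},\ l = 2, \ldots, l_D \mid \mathbf{z}_{\mathcal{V}} = \mathbf{a}\} && (79) \\
&= \prod_{l=2}^{l_D} P\{\mathcal{R}_l, \mathcal{L}_{l-1} \mid \mathcal{R}_j, \mathcal{L}_{j-1},\ j = 2, \ldots, l-1,\ \mathbf{z}_{\mathcal{V}} = \mathbf{a}\} && (80) \\
&\le \prod_{l=2}^{l_D} P\{\mathcal{R}_l \mid \mathcal{R}_j, \mathcal{L}_j,\ j = 2, \ldots, l-1,\ \mathbf{z}_{\mathcal{V}} = \mathbf{a}\} && (81) \\
&\stackrel{(a)}{=} \prod_{l=2}^{l_D} P\{\mathcal{R}_l \mid \mathcal{R}_{l-1}, \mathcal{L}_{l-1},\ \mathbf{z}_{\mathcal{V}} = \mathbf{a}\} && (82) \\
&= \prod_{l=2}^{l_D} P\{[\mathbf{y}_{R_l}(u)] = [\mathbf{y}_{R_l}(u')] \mid \mathcal{R}_{l-1}, \mathcal{L}_{l-1},\ \mathbf{z}_{\mathcal{V}} = \mathbf{a}\} && (83)
\end{aligned}$$

where (a) is true due to the Markov structure in the layered network.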
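Note that if A and B are complex $m \times n$ matrices, then

$$[A_{i,j}] = [B_{i,j}],\ \forall i, j \implies \|A - B\|_\infty \le \sqrt{2}. \tag{84}$$

Therefore by (83) and (84) we have

$$\begin{aligned}
\mathcal{P} &\le \prod_{l=2}^{l_D} P\{\|\mathbf{y}_{R_l}(u) - \mathbf{y}_{R_l}(u')\|_\infty \le \sqrt{2} \mid \mathcal{R}_{l-1}, \mathcal{L}_{l-1},\ \mathbf{z}_{\mathcal{V}} = \mathbf{a}\} && (85) \\
&\stackrel{(a)}{=} \prod_{l=2}^{l_D} P\{\|\mathbf{y}_{R_l}(u) - \mathbf{y}_{R_l}(u')\|_\infty \le \sqrt{2} \mid \mathcal{R}_{l-1}, \mathcal{L}_{l-1}\} && (86)
\end{aligned}$$

where (a) is true since, conditioned on $\mathcal{R}_{l-1}, \mathcal{L}_{l-1}$, the distribution of $\mathbf{y}_{R_l}(u) - \mathbf{y}_{R_l}(u')$ does not
depend on the noise (due to the random mapping).

By defining $H_l$ to be the transfer matrix from the left side of the cut at stage $l-1$ to the
right side of the cut at stage l (i.e., the MIMO channel from $L_{l-1}$ to $R_l$), we have

$$\begin{aligned}
\mathcal{P} &\le \prod_{l=2}^{l_D} P\{\|\mathbf{y}_{R_l}(u) - \mathbf{y}_{R_l}(u')\|_\infty \le \sqrt{2} \mid \mathcal{R}_{l-1}, \mathcal{L}_{l-1}\} && (87) \\
&= \prod_{l=2}^{l_D} P\{\forall\, 1 \le j \le T : \|H_l(\mathbf{x}_{L_{l-1},j}(u) - \mathbf{x}_{L_{l-1},j}(u'))\|_\infty \le \sqrt{2} \mid \mathcal{R}_{l-1}, \mathcal{L}_{l-1}\} && (88) \\
&\stackrel{(b)}{=} \prod_{l=2}^{l_D} P\{\forall\, 1 \le j \le T : \|H_l(\mathbf{x}_{L_{l-1},j}(u) - \mathbf{x}_{L_{l-1},j}(u'))\|_\infty \le \sqrt{2} \mid \mathcal{L}_{l-1}\} && (89)
\end{aligned}$$

where (b) is true since the nodes in $R_{l-1}(\Omega)$ transmit the same codeword under both u and u'.

Now since $\mathbf{x}_{L_{l-1}}(u) \ne \mathbf{x}_{L_{l-1}}(u')$, due to the random mapping they are independent and the
difference is just an i.i.d. complex Gaussian random vector whose components have distribution
$\mathcal{CN}(0, 2)$. Now, we state the following Lemma, which is proved in Appendix D.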

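Lemma 6.4: Assume $[\tilde{x}_{i,1}, \ldots, \tilde{x}_{i,T}]$, $i = 1, \ldots, m$, are i.i.d. vectors of length T with i.i.d.
$\mathcal{CN}(0, 2)$ elements, and $H \in \mathbb{C}^{n \times m}$ is an $n \times m$ matrix. Then

$$P\{\forall\, 1 \le j \le T : \|H[\tilde{x}_{1,j}, \ldots, \tilde{x}_{m,j}]^t\|_\infty \le \sqrt{2}\} \le 2^{-T(I(\mathbf{x}; H\mathbf{x} + \mathbf{z}) - \min(m, n))} \tag{90}$$

where $\mathbf{x}$ and $\mathbf{z}$ are i.i.d. complex unit variance Gaussian vectors of length m and n respectively.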
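For intuition, the scalar case m = n = 1 of Lemma 6.4 can be checked in closed form. The sketch below assumes that the norm of a single complex entry is its modulus, and the function names are illustrative only. With $\tilde{x} \sim \mathcal{CN}(0, 2)$, $|\tilde{x}|^2$ is exponential with mean 2, so $P\{|h\tilde{x}| \le \sqrt{2}\} = 1 - e^{-1/|h|^2}$, while the right-hand side of (90) becomes $2^{-T(\log_2(1 + |h|^2) - 1)}$.

```python
import math


def scalar_event_probability(gain_sq, T=1):
    """P{|h x~_j| <= sqrt(2) for all j = 1..T} with x~_j i.i.d. CN(0, 2), gain_sq = |h|^2."""
    p_single = 1.0 - math.exp(-1.0 / gain_sq)
    return p_single ** T


def scalar_lemma_bound(gain_sq, T=1):
    """Right-hand side of (90) for m = n = 1, where I(x; hx + z) = log2(1 + |h|^2)."""
    mutual_information = math.log2(1.0 + gain_sq)
    return 2.0 ** (-T * (mutual_information - 1.0))


for gain_sq in (0.1, 1.0, 10.0, 1000.0):
    exact = scalar_event_probability(gain_sq, T=5)
    bound = scalar_lemma_bound(gain_sq, T=5)
    print(f"|h|^2 = {gain_sq:7.1f}  exact = {exact:.3e}  bound = {bound:.3e}")
```

In this special case the exact probability stays below the bound for every gain, and both decay exponentially in T once $|h|^2 > 1$.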
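By applying Lemma 6.4 to (89) we get

$$P\{\forall\, 1 \le j \le T : \|H_l(\mathbf{x}_{L_{l-1},j}(u) - \mathbf{x}_{L_{l-1},j}(u'))\|_\infty \le \sqrt{2} \mid \mathcal{L}_{l-1}\} \le 2^{-T(I(x_{L_{l-1}}; y_{R_l} \mid x_{R_{l-1}}) - \min(|L_{l-1}|, |R_l|))} \tag{91}$$

where $x_i$, $i \in \mathcal{V}$, are i.i.d. with Gaussian distribution. Hence

$$\mathcal{P} \le \prod_{l=2}^{l_D} 2^{-T(I(x_{L_{l-1}}; y_{R_l} \mid x_{R_{l-1}}) - \min(|L_{l-1}|, |R_l|))} \le 2^{-T(\overline{C}_{\text{i.i.d.}} - |\mathcal{V}|)} \tag{92}$$

where $\overline{C}_{\text{i.i.d.}}$ is defined in Definition 6.2.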

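The average probability of symbol detection error at the destination can be upper bounded as

$$P_e = P\{\hat{u} \ne u \mid \mathbf{z}_{\mathcal{V}} = \mathbf{a}\} \le 2^{R_{\text{in}}T} P\{u \to u' \mid \mathbf{z}_{\mathcal{V}} = \mathbf{a}\}. \tag{93}$$

By the union bound we have

$$P_e \le \sum_{\Omega \in \Lambda_D} 2^{-T(\overline{C}_{\text{i.i.d.}} - |\mathcal{V}| - R_{\text{in}})} \le 2^{|\mathcal{V}|} 2^{-T(\overline{C}_{\text{i.i.d.}} - |\mathcal{V}| - R_{\text{in}})}. \tag{94}$$

Now, using Fano's inequality we get

$$\begin{aligned}
I(u; [\mathbf{y}_D] \mid \mathbf{z}_{\mathcal{V}} = \mathbf{a}, F_{\mathcal{V}}) &= H(u) - H(u \mid [\mathbf{y}_D], \mathbf{z}_{\mathcal{V}} = \mathbf{a}, F_{\mathcal{V}}) && (95) \\
&= R_{\text{in}}T - H(u \mid [\mathbf{y}_D], \mathbf{z}_{\mathcal{V}} = \mathbf{a}, F_{\mathcal{V}}) && (96) \\
&= R_{\text{in}}T - E_{f_{\mathcal{V}}}\left[H(u \mid [\mathbf{y}_D], \mathbf{z}_{\mathcal{V}} = \mathbf{a}, F_{\mathcal{V}} = f_{\mathcal{V}})\right] && (97) \\
&\stackrel{\text{Fano}}{\ge} R_{\text{in}}T - \left(1 + E_{f_{\mathcal{V}}}\left[P\{\hat{u} \ne u \mid \mathbf{z}_{\mathcal{V}} = \mathbf{a}, F_{\mathcal{V}} = f_{\mathcal{V}}\}\right] R_{\text{in}}T\right) && (98) \\
&= R_{\text{in}}T - (1 + P_e R_{\text{in}}T) && (99) \\
&\ge R_{\text{in}}T - \left(1 + \min\left\{1, 2^{|\mathcal{V}|} 2^{-T(\overline{C}_{\text{i.i.d.}} - |\mathcal{V}| - R_{\text{in}})}\right\} R_{\text{in}}T\right) && (100)
\end{aligned}$$

Hence, the proof is complete. ■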
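To see how the bound of Lemma 6.3 behaves as the block length grows, the following sketch evaluates the right-hand side of (77) divided by T. It assumes the bound has the form $R_{\text{in}}T - (1 + \min\{1, 2^{|\mathcal{V}|}2^{-T(\overline{C}_{\text{i.i.d.}} - |\mathcal{V}| - R_{\text{in}})}\}R_{\text{in}}T)$; the function name and the numerical values are purely illustrative.

```python
def per_symbol_lower_bound(r_in, T, c_iid, num_nodes):
    """Right-hand side of (77) divided by T, in bits per symbol."""
    # Exponent of the error term 2^{|V|} * 2^{-T (C_iid - |V| - R_in)}.
    exponent = num_nodes - T * (c_iid - num_nodes - r_in)
    # The term is capped at 1 by the min, so non-negative exponents need no power.
    error_term = 1.0 if exponent >= 0 else 2.0 ** exponent
    return r_in - (1.0 / T + min(1.0, error_term) * r_in)


# Illustrative numbers: C_iid = 10 bits, |V| = 4 nodes, R_in = 5 < C_iid - |V|.
for T in (1, 10, 100, 1000):
    print(T, per_symbol_lower_bound(5.0, T, 10.0, 4))
```

When $R_{\text{in}} < \overline{C}_{\text{i.i.d.}} - |\mathcal{V}|$ the error term vanishes exponentially in T and the per-symbol bound approaches $R_{\text{in}}$; for short blocks the bound is vacuous.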

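The following lemma, which is proved in Appendix E, bounds the second term on the RHS
of (75).

Lemma 6.5: Assume all nodes perform the operation described in Subsection VI-A2 (a). Then

$$H([\mathbf{y}_D] \mid u, F_{\mathcal{V}}) \le 12T|\mathcal{V}| \tag{101}$$

The next lemma, which is proved in the appendix, bounds the gap between $\overline{C}$ and $\overline{C}_{\text{i.i.d.}}$.

Lemma 6.6: For a Gaussian relay network $\mathcal{G}$,

$$\overline{C} - \overline{C}_{\text{i.i.d.}} < |\mathcal{V}| \tag{102}$$

where $\overline{C}$ is the cut-set upper bound on the capacity of $\mathcal{G}$ and $\overline{C}_{\text{i.i.d.}}$ is defined in Definition 6.2.

Finally, using Lemmas 6.3, 6.5 and 6.6 we have

Lemma 6.7: Assume all nodes perform the operation described in Subsection VI-A2 (a) and
the inner code symbol U is distributed uniformly over $\{1, \ldots, 2^{R_{\text{in}}T}\}$. Then

$$\frac{1}{T} I(u; [\mathbf{y}_D] \mid F_{\mathcal{V}}) \ge R_{\text{in}} - 12|\mathcal{V}| - \left(\frac{1}{T} + \min\left\{1, 2^{|\mathcal{V}|} 2^{-T(\overline{C} - |\mathcal{V}| - R_{\text{in}})}\right\} R_{\text{in}}\right) \tag{103}$$

where $\overline{C}$ is the cut-set upper bound on the capacity of $\mathcal{G}$.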

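Proof: By using Equation (75) and Lemmas 6.3, 6.5 and 6.6 we have

$$\begin{aligned}
\frac{1}{T} I(u; [\mathbf{y}_D] \mid F_{\mathcal{V}}) &\ge \frac{1}{T} I(u; [\mathbf{y}_D] \mid \mathbf{z}_{\mathcal{V}}, F_{\mathcal{V}}) - \frac{1}{T} H([\mathbf{y}_D] \mid u, F_{\mathcal{V}}) && (104) \\
&\stackrel{\text{Lemmas 6.5 and 6.3}}{\ge} \frac{1}{T}\left(R_{\text{in}}T - \left(1 + \min\left\{1, 2^{|\mathcal{V}|} 2^{-T(\overline{C}_{\text{i.i.d.}} - |\mathcal{V}| - R_{\text{in}})}\right\} R_{\text{in}}T\right) - 12T|\mathcal{V}|\right) && (105) \\
&\stackrel{\text{Lemma 6.6}}{\ge} R_{\text{in}} - 12|\mathcal{V}| - \left(\frac{1}{T} + \min\left\{1, 2^{|\mathcal{V}|} 2^{-T(\overline{C} - |\mathcal{V}| - R_{\text{in}})}\right\} R_{\text{in}}\right). && (106)
\end{aligned}$$

An immediate corollary of this lemma is that by choosing $R_{\text{in}}$ arbitrarily close to $\overline{C} - |\mathcal{V}|$,
and letting T be arbitrarily large, for any $\delta > 0$ we get

$$\frac{1}{T} I(u; [\mathbf{y}_D] \mid F_{\mathcal{V}}) \ge \overline{C} - 13|\mathcal{V}| - \delta. \tag{107}$$

Therefore, there exists a choice of mappings that provides an end-to-end mutual information
close to $\overline{C} - 13|\mathcal{V}|$. Hence, we have created a point-to-point channel from u to $[\mathbf{y}_D]$ with at least
this mutual information. We can now use a good outer code to reliably send a message over
N uses of this channel (as illustrated in Figure 18) at any rate up to $\overline{C} - 13|\mathcal{V}|$.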

Hence we get an intermediate proof of Theorem 4.5 for the special case of layered Gaussian
relay networks, with single antennas in the network. This is stated below for convenience, and
its generalization to arbitrary networks with multiple antennas is given in Section VI-B.

Theorem 6.8: Given a Gaussian relay network $\mathcal{G}$ with a layered structure and a single antenna
at each node, all rates R satisfying the following condition are achievable:

$$R < \overline{C} - \kappa_{\text{Lay}} \tag{108}$$

where $\overline{C}$ is the cut-set upper bound on the capacity of $\mathcal{G}$ as described in Equation (27), and $\kappa_{\text{Lay}} =
13|\mathcal{V}|$ is a constant not depending on the channel gains.

4) Vector quantization and network operation: The network operation can easily be generalized
to include vector quantization at each node. Each node in the network generates a
transmission Gaussian codebook of length T with components distributed as i.i.d. $\mathcal{CN}(0, 1)$. The
source operation is as before: it produces a random mapping from messages $w \in \{1, \ldots, 2^{RT}\}$
to its transmit codebook $\mathcal{T}_{x_S}$. We denote this codebook by $\mathbf{x}_S^{(w)}$, $w \in \{1, \ldots, 2^{RT}\}$. Each received
sequence $\mathbf{y}_i$ at node i is quantized to $\hat{\mathbf{y}}_i$ through a Gaussian vector quantizer, with quadratic
distortion set to the noise level. This quantized sequence is randomly mapped onto a transmit
sequence $\mathbf{x}_i$ using a random function $\mathbf{x}_i = f_i(\hat{\mathbf{y}}_i)$. This mapping, as before, is chosen such that
each quantized sequence is mapped uniformly at random to a transmit sequence. These transmit
sequences are chosen to be in $\mathcal{T}_{x_i}$, whose elements are i.i.d. Gaussian $\mathcal{CN}(0, 1)$. We denote the $2^{TR_i}$
sequences of $\hat{\mathbf{y}}_i$ as $\hat{\mathbf{y}}_i^{(k_i)}$, $k_i \in \{1, \ldots, 2^{TR_i}\}$. Standard rate-distortion theory tells us that we need
$R_i > I(Y_i; \hat{Y}_i)$ for this quantization to be successful, where the reconstruction is chosen such
that the quadratic distortion is at the noise level.* Since the uniform random mapping produces
$\mathbf{x}_i = f_i(\hat{\mathbf{y}}_i)$ for a quantized value of index $k_i$, we will denote it by $\hat{\mathbf{y}}_i^{(k_i)}$ and the sequence it is
mapped to by $\mathbf{x}_i^{(k_i)} = f_i(\hat{\mathbf{y}}_i^{(k_i)})$. At the destination, we can either employ a maximum-likelihood
decoder (for which the mutual information is evaluated), or a typicality decoder (see [20] for
more details).

B. General Gaussian relay networks (not necessarily layered) 

Given the proof for layered networks, we are ready to tackle the proof of Theorem 4.5 for
general Gaussian relay networks.

First we formally describe the encoding strategy:

1) Encoding for general Gaussian relay network: We have a single source S with a sequence
of messages $w_j \in \{1, 2, \ldots, 2^{NKTR}\}$, $j = 1, 2, \ldots$. Each message is encoded by the source S
into a signal over NKT transmission times (symbols), giving an overall transmission rate of R.
Similar to the layered case, we have an inner code and an outer code.

a) Inner code:

• Source: maps each inner code symbol $u \in \{1, \ldots, 2^{KTR_{\text{in}}}\}$ to $F_S(u) \in \mathcal{T}_{x_S}$ ($|\mathcal{T}_{x_S}| = 2^{KTR_{\text{in}}}$)
and sends it in KT transmission times.

• Relay i: operates over blocks of T symbols, and at the k-th block quantizes all previously
received sequences $(\mathbf{y}_i^{(1)}, \ldots, \mathbf{y}_i^{(k-1)})$ into $([\mathbf{y}_i^{(1)}], \ldots, [\mathbf{y}_i^{(k-1)}])$, which is then randomly mapped
to $F_i^{(k)}(([\mathbf{y}_i^{(1)}], \ldots, [\mathbf{y}_i^{(k-1)}])) \in \mathcal{T}_{x_i}$.

b) Outer code: Source S has a sequence of messages $w_j \in \{1, 2, \ldots, 2^{NKTR}\}$, $j = 1, 2, \ldots$.
Each message is encoded by the source into N inner code symbols of size $2^{KTR_{\text{in}}}$, $u_1, \ldots, u_N$.
Each inner code symbol is then sent via the inner code over KT transmission times (symbols).

Given the knowledge of all the encoding functions at the relays and the signals received over
$K - |\mathcal{V}|$ blocks, the destination D attempts to decode the message w sent by the source.

Similar to the deterministic case (Section V-B), we use the time expansion idea to analyze
this relaying scheme. We first state the following lemma, which is a corollary of Theorem 6.8.

* Note that we can be conservative and assume the maximal received power, depending on the maximal channel gains. Since
we do not directly convey this quantization index, but just map it forward, this conservative quantization suffices.

Lemma 6.9: All rates R satisfying the following condition are achievable:

$$R < \frac{1}{K} \overline{C}_{\text{unf}}^{(K)} - \kappa \tag{109}$$

where $\overline{C}_{\text{unf}}^{(K)}$ is the cut-set upper bound on the capacity of the time-expanded graph associated
with $\mathcal{G}$, and $\kappa = 13|\mathcal{V}|$.

Proof: By unfolding $\mathcal{G}$ we get an acyclic network such that all the paths from the source
to the destination have equal length. Therefore, by Theorem 6.8, all rates $R_{\text{unf}}$ satisfying the
following condition are achievable in the time-expanded graph:

$$R_{\text{unf}} < \overline{C}_{\text{unf}}^{(K)} - \kappa_{\text{unf}} \tag{110}$$

where $\kappa_{\text{unf}} = 13|\mathcal{V}_{\text{unf}}^{(K)}|$. But the number of nodes (not including the virtual sources and destinations)
at each layer of the unfolded graph is exactly $|\mathcal{V}|$, hence $\kappa_{\text{unf}} = 13K|\mathcal{V}|$. Now since it
takes K steps to translate an achievable scheme in the time-expanded graph to an achievable
scheme in the original graph, we can achieve $\frac{1}{K}R_{\text{unf}}$ and the proof is complete. ■
Similar to the deterministic case, it is easy to see that

$$\overline{C}_{\text{unf}}^{(K)} \ge (K - |\mathcal{V}|)\,\overline{C}. \tag{111}$$

Hence, by Lemma 6.9 and (111), we can achieve all rates up to

$$R < \frac{K - |\mathcal{V}|}{K}\,\overline{C} - \kappa \tag{112}$$

where $\kappa = 13|\mathcal{V}|$. By letting $K \to \infty$ the proof of Theorem 4.5 is complete.
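The rate expression in (112) differs from its limit $\overline{C} - \kappa$ by $|\mathcal{V}|\,\overline{C}/K$, so one can ask how many blocks K are needed before this time-expansion loss drops below a target. The sketch below is only a numerical illustration of that arithmetic; the function names and the example numbers are hypothetical.

```python
import math


def rate_after_unfolding(K, num_nodes, cut_set_bound):
    """Right-hand side of (112) with kappa = 13 |V|."""
    kappa = 13 * num_nodes
    return (K - num_nodes) / K * cut_set_bound - kappa


def blocks_needed(num_nodes, cut_set_bound, epsilon):
    """Smallest K > |V| for which the loss |V| * C / K is at most epsilon."""
    return max(num_nodes + 1, math.ceil(num_nodes * cut_set_bound / epsilon))


# Illustrative numbers: |V| = 4 nodes and a cut-set bound of 100 bits.
K = blocks_needed(4, 100.0, 0.5)
print(K)                                   # 800
print(rate_after_unfolding(K, 4, 100.0))   # within 0.5 of 100 - 52 = 48
```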

To prove Theorem 4.6, i.e., the multicast scenario, we just need to note that if all relays
perform exactly the same strategy then by our theorem, each destination $D \in \mathcal{D}$ will be able
to decode the message with low error probability as long as the rate of the message satisfies

$$R < \min_{D \in \mathcal{D}} \overline{C}_{\text{i.i.d.},D} - \kappa' \tag{113}$$

where $\kappa' < 12|\mathcal{V}|$ is a constant and, as in Definition 6.2, $\overline{C}_{\text{i.i.d.},D} = \min_{\Omega \in \Lambda_D} \log|I +
P G_\Omega G_\Omega^*|$ is the cut-set bound evaluated for i.i.d. input distributions. Therefore as long as $R <
\overline{C}_{\text{mult}} - \kappa$, where $\kappa < 13|\mathcal{V}|$, all destinations can decode the message and hence the theorem is
proved.

In the case that we have multiple antennas at each node, the achievability strategy remains the
same, except now each node receives a vector of observations from different antennas. We first
quantize the received signal of each antenna at the noise level and then map it to another transmit
codeword, which is joint across all antennas. The error probability analysis is exactly the same
as before. However, the gap between the achievable rate and the cut-set bound will be larger. We
can upper bound the gap between $\overline{C}$ and $\overline{C}_{\text{i.i.d.}}$ by the maximum number of degrees of freedom
of the cuts, which due to (259) is at most $\min\{\sum_{i=1}^{|\mathcal{V}|} M_i, \sum_{i=1}^{|\mathcal{V}|} N_i\}$ (see the last paragraph of
the corresponding appendix). Also, by treating each receive antenna as a separate node and applying Lemma
6.5 we get that $H([\mathbf{y}_D] \mid u, F_{\mathcal{V}}) \le 12T\sum_{i=1}^{|\mathcal{V}|} N_i$. Therefore, from our previous analysis we know
that the gap is at most $12\sum_{i=1}^{|\mathcal{V}|} N_i + \min\{\sum_{i=1}^{|\mathcal{V}|} M_i, \sum_{i=1}^{|\mathcal{V}|} N_i\}$ and the theorem is proved when
we have multiple antennas at each node.



VII. Connections between models

In Section II, we showed that while the linear finite-field channel model captures certain high
SNR behaviors of the Gaussian model, it does not capture all aspects. In particular, its capacity is
not within a constant gap to the Gaussian capacity for all MIMO channels. A natural question is:
is there a deterministic channel model which approximates the Gaussian relay network capacity
to within a constant gap?

The proof of the approximation theorem for the Gaussian network capacity in the previous
section already provides a partial answer to this question. We showed that, after quantizing all
the outputs at the relays as well as the destination, the end-to-end mutual information achieved
by the relaying strategy in the noisy network is close to that achieved when the noise sequences
are known at the destination, uniformly over all realizations of the noise sequences. In particular,
this holds true when the noise sequences are all zero. Since the former has been proved to be
close to the capacity of the Gaussian network, this implies that the capacity of the quantized
deterministic model with

$$y_j = \Big[\sum_{i \in \mathcal{V}} h_{ij} x_i\Big], \quad j \in \mathcal{V} \tag{114}$$

must be at least within a constant gap to the capacity of the Gaussian network. It is not too
difficult to show that the deterministic model capacity cannot be much larger. We establish
all this more formally in the next section, where we call the model in (114) the truncated
deterministic model.

A. Connection between the truncated deterministic model and the Gaussian model

Theorem 7.1: The capacity of any Gaussian relay network, $C_{\text{Gaussian}}$, and the capacity of the
corresponding truncated deterministic model, $C_{\text{Truncated}}$, satisfy the following relationship:

$$|C_{\text{Gaussian}} - C_{\text{Truncated}}| \le 32|\mathcal{V}|. \tag{115}$$

To prove this theorem we need the following lemma, which is proved in Appendix G.

Lemma 7.2: Let G be the channel gains matrix of an $m \times n$ MIMO system. Assume that there
is an average power constraint equal to one at each node. Then for any input distribution $P_{\mathbf{x}}$,

$$|I(\mathbf{x}; G\mathbf{x} + Z) - I(\mathbf{x}; [G\mathbf{x}])| \le 19n \tag{116}$$

where $Z = [z_1, \ldots, z_n]$ is a vector of n i.i.d. $\mathcal{CN}(0, 1)$ random variables.

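Proof: (proof of Theorem 7.1)
First note that the value of any cut in the network is the same as the mutual information of a
MIMO system. Therefore from Lemma 7.2 we have

$$|\overline{C}_{\text{Gaussian}} - \overline{C}_{\text{Truncated}}| \le 19|\mathcal{V}|. \tag{117}$$

Now pick the i.i.d. normal $\mathcal{CN}(0, 1)$ distribution for $\{x_i\}_{i \in \mathcal{V}}$. By applying Theorem 4.1 to the
truncated deterministic relay network, we find

$$C_{\text{Truncated}} \ge \min_{\Omega} I(y_{\Omega^c}^{\text{Truncated}}; x_\Omega \mid x_{\Omega^c}) \stackrel{(a)}{=} \min_{\Omega} H(y_{\Omega^c}^{\text{Truncated}} \mid x_{\Omega^c}), \tag{118}$$

where (a) is because we have a deterministic network. By Lemma 6.6 and Lemma 7.2 we have

$$\begin{aligned}
\min_{\Omega} I(y_{\Omega^c}^{\text{Truncated}}; x_\Omega \mid x_{\Omega^c}) &\ge \min_{\Omega} I(y_{\Omega^c}^{\text{Gaussian}}; x_\Omega \mid x_{\Omega^c}) - 19|\mathcal{V}| && (119) \\
&\ge \overline{C}_{\text{Gaussian}} - 20|\mathcal{V}|. && (120)
\end{aligned}$$

Then from Equations (117) and (120) we have

$$\overline{C}_{\text{Gaussian}} - 20|\mathcal{V}| \le C_{\text{Truncated}} \le \overline{C}_{\text{Gaussian}} + 19|\mathcal{V}|. \tag{121}$$

Also from Theorem 4.5 we know that

$$\overline{C}_{\text{Gaussian}} - 13|\mathcal{V}| \le C_{\text{Gaussian}} \le \overline{C}_{\text{Gaussian}}. \tag{122}$$

Therefore

$$|C_{\text{Gaussian}} - C_{\text{Truncated}}| \le 32|\mathcal{V}|. \tag{123}$$

■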

June 4, 2010 DRAFT 






VIII. Extensions 

In this section we extend our main result for Gaussian relay networks (Theorem 4.5) to the
following scenarios:

1) Compound relay network 

2) Frequency selective relay network 

3) Half-duplex relay network 

4) Quasi-static fading relay network (underspread regime) 

5) Low rate capacity approximation of Gaussian relay network 

A. Compound relay network 

The relaying strategy we proposed for general Gaussian relay networks does not require any 
channel information at the relays; relays just quantize at noise level and forward through a 
random mapping. The approximation gap also does not depend on the channel gain values. As a
result, our main result for Gaussian relay networks (Theorem 4.5) can be extended to compound
relay networks where we allow each channel gain $h_{ij}$ to be from a set $\mathcal{H}_{ij}$, and the particular
chosen values are unknown to the source node S, the relays, and the destination node D. A
communication rate R is achievable if there exists a scheme such that for any channel gain 
realizations, the source can communicate to the destination at rate R. 

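Theorem 8.1: The capacity $C_{\text{cn}}$ of the compound Gaussian relay network satisfies

$$\overline{C}_{\text{cn}} - \kappa \le C_{\text{cn}} \le \overline{C}_{\text{cn}}, \tag{124}$$

where $\overline{C}_{\text{cn}}$ is the cut-set upper bound on the compound capacity of $\mathcal{G}$, i.e.

$$\overline{C}_{\text{cn}} = \max_{p(\{x_j\}_{j \in \mathcal{V}})} \inf_{h \in \mathcal{H}} \min_{\Omega \in \Lambda_D} I(y_{\Omega^c}; x_\Omega \mid x_{\Omega^c}), \tag{125}$$

and $\kappa$ is a constant and is upper bounded by $13\sum_{i=1}^{|\mathcal{V}|} N_i + \min\{\sum_{i=1}^{|\mathcal{V}|} M_i, \sum_{i=1}^{|\mathcal{V}|} N_i\}$, where $M_i$
and $N_i$ are respectively the number of transmit and receive antennas at node i.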
Proof outline: We sketch the proof for the case that nodes have a single antenna; its extension to
the multiple antenna scenario is straightforward. As we mentioned earlier, the relaying strategy
that we used in Theorem 4.5 does not require any channel information. However, if all channel
gains are known at the final destination, all rates within a constant gap to the cut-set upper
bound are achievable. We first evaluate how much we lose if the final destination only knows a
quantized version of the channel gains. In particular, assume that each channel gain is bounded,
$|h_{ij}| \in [h_{\min}, h_{\max}]$, and the final destination only knows the channel gain values quantized at level
$\frac{1}{\sqrt{d_{\max}}}$, where $d_{\max}$ is the maximum degree of nodes in $\mathcal{G}$. Then since there is a transmit power
constraint equal to one at each node, the effect of this channel uncertainty can be mimicked by
adding a Gaussian noise of variance $d_{\max} \times \left(\frac{1}{\sqrt{d_{\max}}}\right)^2 = 1$ at each relay node (i.e., doubling the
noise variance at each node), which will result in a reduction of at most $|\mathcal{V}|$ bits from the cut-set
upper bound. Therefore with access to only quantized channel gains, we will lose at most $|\mathcal{V}|$
more bits, which means the gap between the achievable rate and the cut-set bound is at most
$14|\mathcal{V}|$.
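The claim that doubling the noise variance costs at most one bit per node can be illustrated on a single scalar Gaussian link, where the rate drops from $\log_2(1 + \text{SNR})$ to $\log_2(1 + \text{SNR}/2)$. This is only a scalar sanity check under that assumption, not the cut-set argument itself; the function name is hypothetical.

```python
import math


def rate_loss_from_noise_doubling(snr):
    """Loss in bits on a scalar Gaussian link when the noise variance doubles."""
    return math.log2(1.0 + snr) - math.log2(1.0 + snr / 2.0)


for snr in (0.01, 1.0, 100.0, 1e6):
    print(f"SNR = {snr:>9}: loss = {rate_loss_from_noise_doubling(snr):.6f} bits")
```

Since $(1 + s)/(1 + s/2) < 2$ for every $s \ge 0$, the loss is always below one bit and approaches one bit only at high SNR.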

Furthermore, as shown in [23], there exists a universal decoder for this finite set of channel
sets. Hence we can use this decoder at the final destination and decode the message as if we
knew the channel gains quantized at the noise level, for all rates up to

$$R < \max_{p(\{x_j\}_{j \in \mathcal{V}})} \inf_{h \in \hat{\mathcal{H}}} \min_{\Omega \in \Lambda_D} I(y_{\Omega^c}; x_\Omega \mid x_{\Omega^c}) \tag{126}$$

where $\hat{\mathcal{H}}$ represents the quantized state space. Now as we showed earlier, if we restrict the
channels to be quantized at the noise level, the cut-set upper bound changes by at most $|\mathcal{V}|$, therefore

$$\overline{C}_{\text{cn}} - |\mathcal{V}| \le \max_{p(\{x_j\}_{j \in \mathcal{V}})} \inf_{h \in \hat{\mathcal{H}}} \min_{\Omega \in \Lambda_D} I(y_{\Omega^c}; x_\Omega \mid x_{\Omega^c}). \tag{127}$$

Therefore from Equations (126) and (127) all rates up to $\overline{C}_{\text{cn}} - 14|\mathcal{V}|$ are achievable and the
proof can be completed.

Now by using the ideas in [24] and [25], we believe that an infinite state universal decoder
can also be analysed to give "completely oblivious to channel" results. ■



B. Frequency selective Gaussian relay network 

In this section we generalize our main result to the case that the channels are frequency
selective. Since one can present a frequency selective channel as a MIMO link, where each
antenna is operating at a different frequency band,* this extension is just a straightforward
corollary of the case that nodes have multiple antennas.

* This can be implemented in particular by using OFDM and appropriate spectrum shaping or allocation.

Theorem 8.2: The capacity C of the frequency selective Gaussian relay network with F
different frequency bands satisfies

$$\overline{C} - \kappa \le C \le \overline{C} \tag{128}$$

where $\overline{C}$ is the cut-set upper bound on the capacity of $\mathcal{G}$ as described in Equation (27), and $\kappa$ is
a constant and is upper bounded by $12F\sum_{i=1}^{|\mathcal{V}|} N_i + F\min\{\sum_{i=1}^{|\mathcal{V}|} M_i, \sum_{i=1}^{|\mathcal{V}|} N_i\}$, where $M_i$ and
$N_i$ are respectively the number of transmit and receive antennas at node i.



C. Half duplex relay network (fixed transmission scheduling) 

One of the practical constraints on wireless networks is that the transceivers cannot transmit
and receive at the same time on the same frequency band, known as the half-duplex constraint. As
a result of this constraint, the achievable rate of the network will in general be lower. The model
that we use to study this problem is the same as in [26]. In this model the network has finitely many modes
of operation. Each mode of operation (or state of the network), denoted by $m \in \{1, 2, \ldots, M\}$,
is defined as a valid partitioning of the nodes of the network into two sets of "sender" nodes
and "receiver" nodes such that there is no active link that arrives at a sender node.* For each
node i, the transmit and the receive signal at mode m are respectively denoted by $x_i^m$ and $y_i^m$.
Also $t_m$ defines the fraction of the time that the network will operate in state m, as the network use
goes to infinity. The cut-set upper bound on the capacity of the Gaussian relay network with
half-duplex constraint, $C_{\text{hd}}$, is shown to be [26]

$$C_{\text{hd}} \le \overline{C}_{\text{hd}} = \max_{\substack{p(\{x_j^m\}_{j \in \mathcal{V}}),\ t_m:\ 0 \le t_m \le 1, \\ \sum_{m=1}^{M} t_m = 1}} \ \min_{\Omega \in \Lambda_D} \sum_{m=1}^{M} t_m I(y_{\Omega^c}^m; x_\Omega^m \mid x_{\Omega^c}^m). \tag{129}$$

Theorem 8.3: The capacity $C_{\text{hd}}$ of the Gaussian relay network with half-duplex constraint
satisfies

$$\overline{C}_{\text{hd}} - \kappa \le C_{\text{hd}} \le \overline{C}_{\text{hd}} \tag{130}$$

where $\overline{C}_{\text{hd}}$ is the cut-set upper bound on the capacity as described in Equation (129) and $\kappa$ is
a constant and is upper bounded by $12\sum_{i=1}^{|\mathcal{V}|} N_i + \min\{\sum_{i=1}^{|\mathcal{V}|} M_i, \sum_{i=1}^{|\mathcal{V}|} N_i\}$, where $M_i$ and $N_i$
are respectively the number of transmit and receive antennas at node i.

* An active link is defined as a link which is departing from the set of sender nodes.

[Figure 19 panels: (a) network with relays A1 and A2; (b) Mode 1; (c) Mode 2; (d) Mode 3; (e) Mode 4]

Fig. 19. An example of a relay network with two relays is shown in (a). All four modes of half-duplex operation of the relays
are shown in (b)-(e).



Proof: We prove the result for the case that nodes have a single antenna; its extension to
the multiple antenna scenario is straightforward. Since each relay can be either in a transmit or
receive mode, we have a total of $M = 2^{|\mathcal{V}|-2}$ modes. An example of a network with
two relays and all four modes of half-duplex operation of the relays is shown in Figure 19.
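The count $M = 2^{|\mathcal{V}|-2}$ follows because each relay independently either listens or transmits, while the source and destination are not relays. A minimal sketch that enumerates the modes for a given list of relays is shown below; the relay names and the labels "listen" and "transmit" are illustrative choices.

```python
from itertools import product


def half_duplex_modes(relays):
    """All assignments of 'listen' or 'transmit' to the given relays."""
    return [
        dict(zip(relays, states))
        for states in product(("listen", "transmit"), repeat=len(relays))
    ]


modes = half_duplex_modes(["A1", "A2"])
print(len(modes))  # 4
for index, mode in enumerate(modes, start=1):
    print(f"Mode {index}: {mode}")
```

The order in which the modes are listed here is arbitrary and need not match the numbering used in Figure 19.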

Consider the tj's that maximize Chd in (11291) . Assume that they are rational numbers (otherwise 
look at the sequence of rational numbers approaching them) and set W to be the LCM (least 
common divisor) of the denominators. Now increase the bandwidth of system by W and allocate 
Wti of bandwidth to mode i, i = 1 , . . . , M. Each mode is running at a different frequency band. 
Therefore, as shown in Figure [201 we can combine all these modes and create a frequency 
selective relay network. Since the links are orthogonal to each other, the cut-set upper bound 
on the capacity of this frequency selective relay network (in bits/sec/Hz) is the same as (11291) . 
By theorem 18.21 we know that our quantize-map-and-forward scheme achieves, within a constant 
gap, /t, of Cm for all channel gains. In this relaying scheme, at each block, each relay transmits 
a signal that is only a function of its received signal in the previous block and hence does not 
have memory over different blocks. We will translate this scheme to a scheme in the original 
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Fig. 20. Combination of all lialf-duplex modes of tfie network shown in figure [19] Each mode operates at a different frequency 
band. 



network that modes are just at different times (not different frequency bands). The idea is that 
we can expand exactly communication block of the frequency selective network into W blocks 
of the original network and allocating Wti of these blocks to mode i. In the Wti blocks that are 
allocated to mode i, all relays do exactly what they do in frequency band i. This is described 
in Figure [21] for the network of Figure [201 This figure shows how one communication block of 
the frequency selective network (a) is expanded over W blocks of the the original half-duplex 
network (b). Since the transmitted signal at each frequency band is only a function of the data 
received in the previous block of the frequency selective network, the ordering of the modes 
inside the W blocks of the original network is not important at all. Therefore with this strategy 
we can achieve within a constant gap, k, of the cut-set bound of the half-duplex relay network 
and the proof is complete. 
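The time-sharing step of the proof can be sketched numerically. The snippet below is only an illustration under assumed values: the function name `blocks_per_mode` and the example fractions are hypothetical and not taken from the paper. It computes $W$ as the LCM of the denominators of rational $t_i$ and the number of blocks $W t_i$ given to each mode.

```python
from fractions import Fraction
from math import lcm  # available in Python 3.9+


def blocks_per_mode(time_fractions):
    """Return (W, allocation) for rational time fractions t_i that sum to 1.

    W is the least common multiple of the denominators, and allocation[i]
    is the integer number of blocks W * t_i assigned to mode i.
    """
    if sum(time_fractions) != 1:
        raise ValueError("time fractions must sum to 1")
    W = lcm(*(t.denominator for t in time_fractions))
    return W, [int(t * W) for t in time_fractions]


# Hypothetical optimizing fractions for the four modes of a two-relay network.
t = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6), Fraction(0)]
W, allocation = blocks_per_mode(t)
print(W, allocation)
```

Because every $t_i$ has a denominator dividing $W$, each $W t_i$ is an integer and the allocations add up to exactly $W$ blocks.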

One of the differences between this strategy and our original strategy for full duplex networks is that now the relays might be required to have a much larger memory. In the full duplex scenario, in the layered case the relays needed memory over only one block (what they sent was only a function of the previous block); see the footnote below. However, in the half-duplex scenario the relays are required to have memory over $W$ blocks, and $W$ can be arbitrarily large.

Footnote: This could also be done in arbitrary networks but requires an alternative analysis. See the footnote in Section IV-B.

Fig. 21. One communication block of the frequency selective network (a), and its expansion over W blocks of the original half-duplex network (b), with the blocks grouped into Mode 1, Mode 2, Mode 3 and Mode 4.



D. Quasi-static fading relay network (underspread regime)

In a wireless environment channel gains are not fixed and can change over time. In this section we consider a typical scenario in which, although the channel gains change, they can be considered time invariant over a long time scale (for example during the transmission of a block). This happens when the coherence time of the channel ($T_c$) is much larger than the delay spread ($T_d$). Here the delay spread is the largest extent of the unequal path lengths, which in some sense corresponds to inter-symbol interference. Now, depending on how fast the channel gains change compared to the delay requirements, we have two different regimes: fast fading or slow fading. We consider each case separately.

1) Fast fading: In the fast fading scenario the channel gains change much faster than the delay requirement of the application (i.e., the coherence time of the channel, $T_c$, is much smaller than the delay requirement). Therefore, we can interleave data and encode it over different coherence time periods. In this scenario, the ergodic capacity of the network is the relevant capacity measure.

Theorem 8.4: The ergodic capacity $C_{ergodic}$ of the quasi-static fast fading Gaussian relay network satisfies

$$\mathbb{E}_{h_{ij}}\left[\overline{C}(\{h_{ij}\})\right] - \kappa \le C_{ergodic} \le \mathbb{E}_{h_{ij}}\left[\overline{C}(\{h_{ij}\})\right] \tag{131}$$

where $\overline{C}$ is the cut-set upper bound on the capacity as described in Equation (27) and the expectation is taken over the channel gain distribution. Also, the constant $\kappa$ is upper bounded by $12\sum_{i=1}^{|\mathcal{V}|} N_i + \min\{\sum_{i=1}^{|\mathcal{V}|} M_i, \sum_{i=1}^{|\mathcal{V}|} N_i\}$, where $M_i$ and $N_i$ are respectively the number of transmit and receive antennas at node $i$.

Proof: We prove the result for the case in which each node has a single antenna. Its extension to the multiple-antenna scenario is straightforward. The upper bound is just the cut-set upper bound. For the achievability, note that the relaying strategy we proposed for general relay networks does not depend on the channel realization: the relays just quantize at the noise level and forward through a random mapping. The approximation gap also does not depend on the channel parameters. As a result, by coding data over $L$ different channel realizations the following rate is achievable:

$$\frac{1}{L}\sum_{l=1}^{L}\left(\overline{C}(\{h_{ij}\}_l) - \kappa\right). \tag{132}$$

Now as $L \to \infty$,

$$\frac{1}{L}\sum_{l=1}^{L}\overline{C}(\{h_{ij}\}_l) \to \mathbb{E}_{h_{ij}}\left[\overline{C}(\{h_{ij}\})\right] \tag{133}$$

and the theorem is proved. ■

2) Slow fading: In a slow fading scenario the delay requirement does not allow us to interleave data and encode it over different coherence time periods. We assume that there is no channel gain information available at the source; therefore there is no definite capacity, and for a fixed target rate $R$ we should look at the outage probability,

$$\mathcal{P}_{out}(R) = \mathbb{P}\left\{C(\{h_{ij}\}) < R\right\} \tag{134}$$

where the probability is calculated over the distribution of the channel gains, and the $\epsilon$-outage capacity is defined as

$$C_\epsilon = \mathcal{P}_{out}^{-1}(\epsilon). \tag{135}$$

Here is our main result to approximate the outage probability.

Theorem 8.5: The outage probability $\mathcal{P}_{out}(R)$ of the quasi-static slow fading Gaussian relay network satisfies

$$\mathbb{P}\left\{\overline{C}(\{h_{ij}\}) < R\right\} \le \mathcal{P}_{out}(R) \le \mathbb{P}\left\{\overline{C}(\{h_{ij}\}) < R + \kappa\right\} \tag{136}$$

where $\overline{C}$ is the cut-set upper bound on the capacity as described in Equation (27) and the probability is calculated over the distribution of the channel gains. The constant $\kappa$ is upper bounded by $12\sum_{i=1}^{|\mathcal{V}|} N_i + \min\{\sum_{i=1}^{|\mathcal{V}|} M_i, \sum_{i=1}^{|\mathcal{V}|} N_i\}$, where $M_i$ and $N_i$ are respectively the number of transmit and receive antennas at node $i$.

Proof: The lower bound is just based on the cut-set upper bound on the capacity. For the upper bound we use the compound network result. Therefore, based on Theorem 8.1 we know that as long as $\overline{C}(\{h_{ij}\}) - \kappa \ge R$ there will not be an outage. ■
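The sandwich in (136) can be estimated by Monte Carlo once samples of the cut-set bound are available. The sketch below is illustrative only: the helper `outage_bounds` is a hypothetical name, and the samples come from an assumed single Rayleigh-faded link at 20 dB (so that the cut-set value is $\log_2(1 + 100|h|^2)$), not from a relay network in the paper.

```python
import math
import random


def outage_bounds(cutset_samples, R, kappa):
    """Empirical lower and upper bounds on the outage probability at rate R.

    lower estimates P{C_bar < R}; upper estimates P{C_bar < R + kappa}.
    """
    n = len(cutset_samples)
    lower = sum(c < R for c in cutset_samples) / n
    upper = sum(c < R + kappa for c in cutset_samples) / n
    return lower, upper


# Assumed toy model: |h|^2 is exponential with mean 1 (Rayleigh fading).
rng = random.Random(0)
samples = [math.log2(1 + 100 * rng.expovariate(1.0)) for _ in range(10000)]
lo, hi = outage_bounds(samples, R=3.0, kappa=1.0)
print(lo, hi)
```

The two estimates bracket the true outage probability of any scheme whose rate is within $\kappa$ of the cut-set bound; the width of the bracket is governed by how much probability mass the cut-set value places in the interval $[R, R+\kappa)$.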

E. Low rate capacity approximation of Gaussian relay network

In the low data rate regime, a constant-gap approximation of the capacity may no longer be useful. A more useful kind of approximation in this regime is a universal multiplicative approximation, where the multiplicative factor does not depend on the channel gains in the network.

Theorem 8.6: The capacity $C$ of the Gaussian relay network satisfies

$$\lambda \overline{C} \le C \le \overline{C} \tag{137}$$

where $\overline{C}$ is the cut-set upper bound on the capacity, as described in Equation (27), and $\lambda$ is a constant lower bounded by $\frac{1}{2d(d+1)}$, where $d$ is the maximum degree of nodes in $\mathcal{G}$.

Proof: First we use a time-division scheme and make all links in the network orthogonal to each other. By Vizing's theorem (e.g., see [27] p. 153) any simple undirected graph can be edge colored with at most $d + 1$ colors, where $d$ is the maximum degree of nodes in $\mathcal{G}$. Since our graph $\mathcal{G}$ is a directed graph we need at most $2(d + 1)$ colors. Therefore we can generate $2(d+1)$ time slots and assign the slots to directed graphs such that at any node all the links are orthogonal to each other. Therefore each link is used a $\frac{1}{2(d+1)}$ fraction of the time. We further impose the constraint that each of these links is used only a $\frac{1}{2d(d+1)}$ fraction of the time, but with a factor of $d$ more power. By coding we can convert each link $h_{ij}$ into a noise free link with capacity

$$c_{ij} = \frac{1}{2d(d+1)} \log\left(1 + d|h_{ij}|^2\right). \tag{138}$$

By the Ford-Fulkerson theorem we know that the capacity of this network is

$$C_{orthogonal} = \min_{\Omega} \sum_{i \in \Omega,\, j \in \Omega^c} c_{ij} \tag{139}$$

and this rate is achievable in the original Gaussian relay network. Now we will prove that

$$C_{orthogonal} \ge \frac{1}{2d(d+1)} \overline{C}. \tag{140}$$

To show this, assume that in the orthogonal network each node transmits the same signal on its outgoing links. Furthermore, each node $j$ takes the summation of all incoming signals (normalized by $\frac{1}{\sqrt{d}}$) and denotes it as its received signal $y_j$, i.e.

$$y_j[t] = \frac{1}{\sqrt{d}} \sum_{i=1}^{d} \left(h_{ij}\sqrt{d}\, x_i[t] + z_{ij}[t]\right) \tag{141}$$

$$= \sum_{i=1}^{d} h_{ij} x_i[t] + \tilde{z}_j[t] \tag{142}$$

where

$$\tilde{z}_j[t] = \frac{\sum_{i=1}^{d} z_{ij}[t]}{\sqrt{d}} \sim \mathcal{CN}(0, 1). \tag{143}$$

Therefore we get a network which is statistically similar to the original non-orthogonal network; however, each time slot is only a $\frac{1}{2d(d+1)}$ fraction of the time slots in the original network. Without the restriction that each node transmits the same signal on all its outgoing links, the cut-set value of the orthogonal network can only increase. Hence

$$C_{orthogonal} \ge \frac{1}{2d(d+1)} \overline{C}. \tag{144}$$
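The orthogonalization argument can be tried on a small example. The sketch below is an illustration under assumptions: it uses a hypothetical four-node diamond topology with made-up real gains, computes the per-link capacities of (138) in bits, and evaluates the min-cut of (139) by brute-force enumeration of cuts (feasible only for tiny networks). The function names are not from the paper.

```python
import math
from itertools import combinations


def link_capacity(h, d):
    """Per-link capacity of (138): a 1/(2d(d+1)) time share with d times the power."""
    return math.log2(1 + d * abs(h) ** 2) / (2 * d * (d + 1))


def min_cut(nodes, source, dest, capacity):
    """Brute-force min-cut over all sets Omega containing source but not dest."""
    relays = [v for v in nodes if v not in (source, dest)]
    best = math.inf
    for r in range(len(relays) + 1):
        for subset in combinations(relays, r):
            omega = {source, *subset}
            value = sum(c for (i, j), c in capacity.items()
                        if i in omega and j not in omega)
            best = min(best, value)
    return best


# Assumed diamond network: S -> {A, B} -> D, maximum degree d = 2.
gains = {("S", "A"): 3.0, ("S", "B"): 1.0, ("A", "D"): 1.0, ("B", "D"): 3.0}
d = 2
capacity = {edge: link_capacity(h, d) for edge, h in gains.items()}
c_orth = min_cut(["S", "A", "B", "D"], "S", "D", capacity)
print(c_orth)
```

In this toy instance the minimizing cut is $\Omega = \{S, A\}$, which crosses the two weak links, so the min-cut equals twice the capacity of a unit-gain link.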



IX. Conclusions 

In this paper we presented a new approach to analyzing the capacity of Gaussian relay networks. We start with deterministic models to build insights and use them as the foundation to analyze Gaussian models. The main results are a new scheme for general Gaussian relay networks called quantize-map-and-forward and a proof that it achieves the cut-set bound to within a constant gap. The gap does not depend on the SNR or the channel values of the network. No other scheme in the literature has this property.

One limitation of these results is that the gap grows with the number of nodes in the network. This is due to the noise accumulation property of the quantize-map-and-forward scheme. It is an interesting question whether there is another scheme that can circumvent this to achieve a universal constant gap to the cut-set bound, independent of the number of nodes, or whether this is an inherent feature of any scheme. In the latter case a better upper bound than the cut-set bound is needed.

Appendix A
Proof of Theorem 3.1

If $|h_{SR}| < |h_{SD}|$ then the relay is ignored and a communication rate equal to $R = \log(1 + |h_{SD}|^2)$ is achievable. If $|h_{SR}| > |h_{SD}|$ the problem becomes more interesting. In this case by
using the decode-forward scheme described in BU we can achieve 

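$$R = \min\left(\log\left(1 + |h_{SR}|^2\right), \log\left(1 + |h_{SD}|^2 + |h_{RD}|^2\right)\right). \tag{145}$$

Therefore, overall the following rate is always achievable:

$$R_{DF} = \max\left\{\log(1 + |h_{SD}|^2),\ \min\left(\log\left(1 + |h_{SR}|^2\right), \log\left(1 + |h_{SD}|^2 + |h_{RD}|^2\right)\right)\right\}.$$

Now we compare this achievable rate with the cut-set upper bound on the capacity of the Gaussian relay network

$$C \le \overline{C} = \max_{|\rho| \le 1} \min\left\{\log\left(1 + (1 - \rho^2)(|h_{SD}|^2 + |h_{SR}|^2)\right),\ \log\left(1 + |h_{SD}|^2 + |h_{RD}|^2 + 2\rho|h_{SD}||h_{RD}|\right)\right\}.$$

Note that if $|h_{SR}| > |h_{SD}|$ then

$$R_{DF} = \min\left(\log\left(1 + |h_{SR}|^2\right), \log\left(1 + |h_{SD}|^2 + |h_{RD}|^2\right)\right) \tag{146}$$

and for all $|\rho| \le 1$ we have

$$\log\left(1 + (1 - \rho^2)(|h_{SD}|^2 + |h_{SR}|^2)\right) \le \log\left(1 + |h_{SR}|^2\right) + 1 \tag{147}$$

$$\log\left(1 + |h_{SD}|^2 + |h_{RD}|^2 + 2\rho|h_{SD}||h_{RD}|\right) \le \log\left(1 + |h_{SD}|^2 + |h_{RD}|^2\right) + 1. \tag{148}$$

Hence

$$R_{DF} \ge \overline{C}_{relay} - 1. \tag{149}$$

Also if $|h_{SD}| > |h_{SR}|$,

$$R_{DF} = \log(1 + |h_{SD}|^2) \tag{150}$$

and

$$\log\left(1 + (1 - \rho^2)(|h_{SD}|^2 + |h_{SR}|^2)\right) \le \log\left(1 + |h_{SD}|^2\right) + 1, \tag{151}$$

therefore again,

$$R_{DF} \ge \overline{C}_{relay} - 1. \tag{152}$$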
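The one-bit claim can be spot-checked numerically. The sketch below is a sanity check rather than a proof: it evaluates the decode-forward rate and a grid approximation of the cut-set bound (the maximization over $\rho$ is done on a finite grid, so the computed bound is at most the true one) for randomly drawn real gains. The function names and the sampling ranges are assumptions made for illustration.

```python
import math
import random


def decode_forward_rate(h_sd, h_sr, h_rd):
    """R_DF: the better of direct transmission and decode-forward."""
    direct = math.log2(1 + h_sd ** 2)
    df = min(math.log2(1 + h_sr ** 2), math.log2(1 + h_sd ** 2 + h_rd ** 2))
    return max(direct, df)


def cutset_bound(h_sd, h_sr, h_rd, grid=2001):
    """Grid approximation of the relay cut-set bound over rho in [-1, 1]."""
    best = 0.0
    for k in range(grid):
        rho = -1 + 2 * k / (grid - 1)
        broadcast = math.log2(1 + (1 - rho ** 2) * (h_sd ** 2 + h_sr ** 2))
        mac = math.log2(1 + h_sd ** 2 + h_rd ** 2 + 2 * rho * h_sd * h_rd)
        best = max(best, min(broadcast, mac))
    return best


rng = random.Random(1)
worst_gap = 0.0
for _ in range(200):
    gains = [rng.uniform(0.0, 30.0) for _ in range(3)]
    gap = cutset_bound(*gains) - decode_forward_rate(*gains)
    worst_gap = max(worst_gap, gap)
print(worst_gap)
```

Inequalities (147), (148) and (151) hold for every $\rho$, so the gap stays at or below 1 bit for every grid point regardless of the gains drawn.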
Appendix B
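Proof of Theorem 3.2

The cut-set upper bound on the capacity of the diamond network is

$$C_{diamond} \le \overline{C} \le \min\Big\{\log\left(1 + |h_{SA_1}|^2 + |h_{SA_2}|^2\right),\ \log\left(1 + (|h_{A_1D}| + |h_{A_2D}|)^2\right),$$
$$\log(1 + |h_{SA_1}|^2) + \log(1 + |h_{A_2D}|^2),\ \log(1 + |h_{SA_2}|^2) + \log(1 + |h_{A_1D}|^2)\Big\}. \tag{153}$$

Without loss of generality assume $|h_{SA_1}| \ge |h_{SA_2}|$. Then we have the following cases:

1) $|h_{SA_1}| \le |h_{A_1D}|$: In this case

$$R_{PDF} \ge \log(1 + |h_{SA_1}|^2) \ge \overline{C} - 1. \tag{154}$$

2) $|h_{SA_1}| > |h_{A_1D}|$: Let $\alpha = \frac{|h_{A_1D}|^2}{|h_{SA_1}|^2}$, then

$$R_{PDF} = \log(1 + |h_{A_1D}|^2) + \min\left\{\log\left(1 + \frac{(1-\alpha)|h_{SA_2}|^2}{\alpha|h_{SA_2}|^2 + 1}\right),\ \log\left(1 + \frac{|h_{A_2D}|^2}{1 + |h_{A_1D}|^2}\right)\right\} \tag{155}$$

or

$$R_{PDF} = \min\left\{\log\left(\frac{(1 + |h_{SA_2}|^2)(1 + |h_{A_1D}|^2)}{\alpha|h_{SA_2}|^2 + 1}\right),\ \log\left(1 + |h_{A_1D}|^2 + |h_{A_2D}|^2\right)\right\}. \tag{156}$$

Now if

$$\log\left(\frac{(1 + |h_{SA_2}|^2)(1 + |h_{A_1D}|^2)}{\alpha|h_{SA_2}|^2 + 1}\right) \ge \log\left(1 + |h_{A_1D}|^2 + |h_{A_2D}|^2\right) \tag{157}$$

we have

$$R_{PDF} = \log(1 + |h_{A_1D}|^2 + |h_{A_2D}|^2) \tag{158}$$
$$\ge \log\left(1 + (|h_{A_1D}| + |h_{A_2D}|)^2\right) - 1 \ge \overline{C} - 1. \tag{159}$$

Therefore, the achievable rate of the partial decode-forward scheme is within one bit of the cut-set bound. So we just need to look at the case that

$$R_{PDF} = \log\left(\frac{(1 + |h_{SA_2}|^2)(1 + |h_{A_1D}|^2)}{\alpha|h_{SA_2}|^2 + 1}\right). \tag{160}$$

In this case consider two possibilities:

• $\alpha|h_{SA_2}|^2 \le 1$. Here we have

$$R_{PDF} = \log\left(\frac{(1 + |h_{SA_2}|^2)(1 + |h_{A_1D}|^2)}{\alpha|h_{SA_2}|^2 + 1}\right) \tag{161}$$
$$\ge \log\left(\frac{(1 + |h_{SA_2}|^2)(1 + |h_{A_1D}|^2)}{2}\right) \tag{162}$$
$$= \log(1 + |h_{SA_2}|^2) + \log(1 + |h_{A_1D}|^2) - 1 \ge \overline{C} - 1. \tag{163}$$

• $\alpha|h_{SA_2}|^2 \ge 1$. In this case we will show that

$$R_{PDF} = \log\left(\frac{(1 + |h_{SA_2}|^2)(1 + |h_{A_1D}|^2)}{\alpha|h_{SA_2}|^2 + 1}\right) \tag{164}$$
$$\ge \log\left(1 + |h_{SA_1}|^2 + |h_{SA_2}|^2\right) - 1 \tag{165}$$
$$\ge \overline{C} - 1. \tag{166}$$

To show this we just need to prove

$$\frac{(1 + |h_{SA_2}|^2)(1 + |h_{A_1D}|^2)}{\alpha|h_{SA_2}|^2 + 1} \ge \frac{1}{2}\left(1 + |h_{SA_1}|^2 + |h_{SA_2}|^2\right). \tag{167}$$

By replacing $\alpha = \frac{|h_{A_1D}|^2}{|h_{SA_1}|^2}$, we get

$$2|h_{SA_1}|^2(1 + |h_{SA_2}|^2)(1 + |h_{A_1D}|^2) \ge \left(1 + |h_{SA_1}|^2 + |h_{SA_2}|^2\right)\left(|h_{SA_1}|^2 + |h_{SA_2}|^2|h_{A_1D}|^2\right).$$

But note that

$$2|h_{SA_1}|^2(1 + |h_{SA_2}|^2)(1 + |h_{A_1D}|^2) - \left(1 + |h_{SA_1}|^2 + |h_{SA_2}|^2\right)\left(|h_{SA_1}|^2 + |h_{SA_2}|^2|h_{A_1D}|^2\right)$$
$$= |h_{SA_1}|^2 + |h_{SA_1}|^2|h_{A_1D}|^2 + \left(|h_{SA_1}|^2|h_{SA_2}|^2 - |h_{SA_2}|^2|h_{A_1D}|^2\right) + \left(|h_{SA_1}|^2|h_{A_1D}|^2 - |h_{SA_2}|^4|h_{A_1D}|^2\right) + \left(|h_{SA_1}|^2|h_{SA_2}|^2|h_{A_1D}|^2 - |h_{SA_1}|^4\right)$$
$$= |h_{SA_1}|^2 + |h_{SA_1}|^2|h_{A_1D}|^2 + |h_{SA_2}|^2\left(|h_{SA_1}|^2 - |h_{SA_2}|^2|h_{A_1D}|^2\right) + |h_{A_1D}|^2\left(|h_{SA_1}|^2 - |h_{SA_2}|^2\right) + |h_{SA_1}|^2\left(|h_{SA_2}|^2|h_{A_1D}|^2 - |h_{SA_1}|^2\right)$$
$$= |h_{SA_1}|^2 + |h_{SA_1}|^2|h_{A_1D}|^2 + \left(|h_{SA_1}|^2 - |h_{SA_2}|^2\right)\left(|h_{SA_2}|^2|h_{A_1D}|^2 - |h_{SA_1}|^2 + |h_{A_1D}|^2\right) \ge 0$$

where the last step is true since

$$|h_{SA_1}|^2 \ge |h_{SA_2}|^2 \tag{168}$$
$$|h_{SA_2}|^2|h_{A_1D}|^2 \ge |h_{SA_1}|^2 \quad (\text{since } \alpha|h_{SA_2}|^2 \ge 1). \tag{169}$$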
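The polynomial identity used in the last step of this proof is easy to confirm mechanically. In the sketch below, `a`, `b` and `c` are shorthand introduced here (not in the paper) for $|h_{SA_1}|^2$, $|h_{SA_2}|^2$ and $|h_{A_1D}|^2$; exact integer arithmetic is used so the comparison has no rounding error.

```python
from itertools import product


def difference(a, b, c):
    """Left side minus right side of the inequality obtained from (167)."""
    return 2 * a * (1 + b) * (1 + c) - (1 + a + b) * (a + b * c)


def factored(a, b, c):
    """Final factored form of the same difference."""
    return a + a * c + (a - b) * (b * c - a + c)


values = range(0, 8)
identity_holds = all(difference(a, b, c) == factored(a, b, c)
                     for a, b, c in product(values, repeat=3))

# Under (168) and (169), i.e. a >= b and b*c >= a, the difference is nonnegative.
nonnegative = all(difference(a, b, c) >= 0
                  for a, b, c in product(values, repeat=3)
                  if a >= b and b * c >= a)
print(identity_holds, nonnegative)
```

Both forms are polynomials of degree at most 2 in each variable, so agreement on an 8 by 8 by 8 integer grid is enough to establish that they are identical.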
Appendix C



Proof of Theorems 4.1 and 4.2

In this appendix we prove Theorems 4.1 and 4.2. We first generalize the encoding scheme to accommodate arbitrary deterministic functions of (33) in Section C-A. We then illustrate the ingredients of the proof using the same example as in Section IV-A2. The complete proof of our result for layered networks is given in Section C-C. The extension to the non-layered case is very similar to the proof for the linear finite-field model discussed in Section IV-B, hence is omitted.

A. Encoding for layered general deterministic relay network

We have a single source $S$ with a sequence of messages $w_k \in \{1, 2, \ldots, 2^{RT}\}$, $k = 1, 2, \ldots$. Each message is encoded by the source $S$ into a signal over $T$ transmission times (symbols), giving an overall transmission rate of $R$. We will use strong (robust) typicality as defined in [28]. The notion of joint typicality is naturally extended from Definition C.1.

Definition C.1: We define $\mathbf{x}$ as $\delta$-typical with respect to distribution $p$, and denote it by $\mathbf{x} \in T_\delta$, if

$$|\nu_{\mathbf{x}}(x) - p(x)| \le \delta\, p(x), \quad \forall x,$$

where $\delta \in \mathbb{R}^+$ and $\nu_{\mathbf{x}}(x) = \frac{1}{T}|\{t : x_t = x\}|$ is the empirical frequency.

Each relay operates over blocks of $T$ symbols, and uses a mapping $f_j : \mathcal{Y}_j^T \to \mathcal{X}_j^T$ from its previous block of received $T$ symbols to the transmit signals in the next block. In particular, block $k$ of $T$ received symbols is denoted by $\mathbf{y}_j^{(k)} = \{y[(k-1)T + 1], \ldots, y[kT]\}$ and the transmit symbols by $\mathbf{x}_j^{(k)}$. Choose some product distribution $\prod_{i \in \mathcal{V}} p(x_i)$. At the source $S$, map each of the indices $w_k \in \{1, 2, \ldots, 2^{RT}\}$ through $f_S(w_k)$ onto a sequence drawn uniformly from $T_\delta(x_S)$, which is the typical set of sequences in $\mathcal{X}_S^T$. At any relay node $j$ choose $f_j$ to map each typical sequence in $T_\delta(y_j)$ onto the typical set of transmit sequences $T_\delta(x_j)$, as

$$\mathbf{x}_j^{(k)} = f_j\left(\mathbf{y}_j^{(k-1)}\right), \tag{170}$$

where $f_j$ is chosen to map uniformly randomly each sequence in $T_\delta(y_j)$ onto $T_\delta(x_j)$. Each relay does the encoding prescribed by (170).
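The typicality condition of Definition C.1 can be written as a short membership test. The sketch below is an illustration only; the function name `is_delta_typical` and the example distribution are assumptions, and the sequence is taken over a finite alphabet given as the keys of `p`.

```python
from collections import Counter


def is_delta_typical(sequence, p, delta):
    """Check |nu(x) - p(x)| <= delta * p(x) for every symbol x.

    `p` maps each alphabet symbol to its probability; a symbol outside the
    alphabet (or one with p(x) = 0 that actually occurs) makes the sequence
    atypical.
    """
    T = len(sequence)
    counts = Counter(sequence)
    if any(symbol not in p for symbol in counts):
        return False
    return all(abs(counts.get(x, 0) / T - px) <= delta * px
               for x, px in p.items())


p = {"a": 0.5, "b": 0.5}
print(is_delta_typical("aabb", p, 0.1))   # empirical frequencies match p exactly
print(is_delta_typical("aaab", p, 0.1))   # frequency of "a" is 0.75, too far from 0.5
```

Because the tolerance is proportional to $p(x)$, a symbol with zero probability may not appear at all in a typical sequence, which is what distinguishes strong (robust) typicality from the weak, entropy-based notion.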
B. Proof illustration

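Now, we illustrate the ideas behind the proof of Theorem 4.1 for layered networks using the same example as in Section IV-A2, which was done for the linear deterministic model. Since we are dealing with deterministic networks, the logic up to (47) in Section IV-A2 remains the same. We will again illustrate the ideas using the cut $\Omega = \{S, A_1, B_1\}$. As in Section IV-A2, we can write

$$\mathcal{P} = \mathbb{P}\{\mathcal{A}_2, \mathcal{B}_2, \mathcal{D}, \mathcal{A}_1^c, \mathcal{B}_1^c\} \tag{171}$$
$$= \mathbb{P}\{\mathcal{A}_2\} \times \mathbb{P}\{\mathcal{B}_2, \mathcal{A}_1^c \mid \mathcal{A}_2\} \times \mathbb{P}\{\mathcal{D}, \mathcal{B}_1^c \mid \mathcal{A}_2, \mathcal{B}_2, \mathcal{A}_1^c\} \tag{172}$$
$$\le \mathbb{P}\{\mathcal{A}_2\} \times \mathbb{P}\{\mathcal{B}_2 \mid \mathcal{A}_1^c, \mathcal{A}_2\} \times \mathbb{P}\{\mathcal{D} \mid \mathcal{B}_1^c, \mathcal{A}_2, \mathcal{B}_2, \mathcal{A}_1^c\} \tag{173}$$
$$= \mathbb{P}\{\mathcal{A}_2\} \times \mathbb{P}\{\mathcal{B}_2 \mid \mathcal{A}_1^c, \mathcal{A}_2\} \times \mathbb{P}\{\mathcal{D} \mid \mathcal{B}_1^c, \mathcal{B}_2\} \tag{174}$$

where the events $\{\mathcal{A}_1, \mathcal{A}_2, \mathcal{B}_1, \mathcal{B}_2, \mathcal{D}\}$ are defined in (48), and the last step is true since there is an independent random mapping at each node and we have a Markovian layered structure in the network.

Note that since $\mathbf{y}_j \in T_\delta(y_j)$ with high probability, we can focus only on the typical received signals. Let us first examine the probability that $\mathbf{y}_{A_2}(w) = \mathbf{y}_{A_2}(w')$. Since $S$ can distinguish between $w, w'$, it maps these messages independently to two transmitted signals $\mathbf{x}_S(w), \mathbf{x}_S(w') \in T_\delta(x_S)$, hence we can see that

$$\mathbb{P}\{\mathcal{A}_2\} = \mathbb{P}\left\{(\mathbf{x}_S(w'), \mathbf{y}_{A_2}(w)) \in T_\delta(x_S, y_{A_2})\right\} \doteq 2^{-T I(x_S; y_{A_2})}, \tag{175}$$

where $\doteq$ indicates exponential equality (where we neglect subexponential constants).

Now, in order to analyze the second probability, as seen in the linear model analysis, $\mathcal{A}_2$ implies $\mathbf{x}_{A_2}(w) = \mathbf{x}_{A_2}(w')$, i.e., the same signal is sent under both $w, w'$. Therefore, since $(\mathbf{x}_{A_2}(w), \mathbf{y}_{B_2}(w)) \in T_\delta(x_{A_2}, y_{B_2})$, obviously $(\mathbf{x}_{A_2}(w'), \mathbf{y}_{B_2}(w)) \in T_\delta(x_{A_2}, y_{B_2})$ as well. Therefore, under $w'$, we already have $\mathbf{x}_{A_2}(w')$ jointly typical with the signal that is received under $w$. However, since $A_1$ can distinguish between $w, w'$, it will map the transmit sequence $\mathbf{x}_{A_1}(w')$ to a sequence which is independent of $\mathbf{x}_{A_1}(w)$ transmitted under $w$. Since an error occurs when $(\mathbf{x}_{A_1}(w'), \mathbf{x}_{A_2}(w'), \mathbf{y}_{B_2}(w)) \in T_\delta(x_{A_1}, x_{A_2}, y_{B_2})$, and since $A_2$ cannot distinguish between $w, w'$ so that $\mathbf{x}_{A_2}(w) = \mathbf{x}_{A_2}(w')$, we require that $(\mathbf{x}_{A_1}, \mathbf{x}_{A_2}, \mathbf{y}_{B_2})$ generated like $p(\mathbf{x}_{A_1})p(\mathbf{x}_{A_2}, \mathbf{y}_{B_2})$ behaves like a jointly typical sequence. Therefore, this probability is given by

$$\mathbb{P}\{\mathcal{B}_2 \mid \mathcal{A}_1^c, \mathcal{A}_2\} = \mathbb{P}\left\{(\mathbf{x}_{A_1}(w'), \mathbf{x}_{A_2}(w), \mathbf{y}_{B_2}(w)) \in T_\delta(x_{A_1}, x_{A_2}, y_{B_2})\right\} \doteq 2^{-T I(x_{A_1}; y_{B_2}, x_{A_2})} \overset{(a)}{=} 2^{-T I(x_{A_1}; y_{B_2} \mid x_{A_2})}, \tag{176}$$

where (a) follows since we have generated the mappings $f_j$ independently, which induces an independent distribution on $\mathbf{x}_{A_1}, \mathbf{x}_{A_2}$. Another way to see this is that the probability (176) is $\frac{|T_\delta(x_{A_1} \mid x_{A_2}, y_{B_2})|}{|T_\delta(x_{A_1})|}$, which by using properties of (robustly) typical sequences [28] yields the same expression as in (176). Note that the calculation in (176) is similar to one of the error event calculations in a multiple access channel.

Using a similar logic we can write

$$\mathbb{P}\{\mathcal{D} \mid \mathcal{B}_1^c, \mathcal{B}_2\} = \mathbb{P}\left\{(\mathbf{x}_{B_1}(w'), \mathbf{x}_{B_2}(w), \mathbf{y}_D(w)) \in T_\delta(x_{B_1}, x_{B_2}, y_D)\right\} \doteq 2^{-T I(x_{B_1}; y_D, x_{B_2})} = 2^{-T I(x_{B_1}; y_D \mid x_{B_2})}. \tag{177}$$

Therefore, putting (175)-(177) together as done in (56) we get

$$\mathcal{P} \le 2^{-T\left\{I(x_S; y_{A_2}) + I(x_{A_1}; y_{B_2} \mid x_{A_2}) + I(x_{B_1}; y_D \mid x_{B_2})\right\}}.$$

Note that, for this example, due to the Markovian structure of the network we can see that $I(y_{\Omega^c}; x_\Omega \mid x_{\Omega^c}) = I(x_S; y_{A_2}) + I(x_{A_1}; y_{B_2} \mid x_{A_2}) + I(x_{B_1}; y_D \mid x_{B_2})$, hence as in (57) we get

$$P_e \le 2^{RT}|\Lambda_D|\, 2^{-T \min_{\Omega \in \Lambda_D} H(y_{\Omega^c} \mid x_{\Omega^c})} \tag{178}$$

and hence the error probability can be made as small as desired if $R < \min_{\Omega \in \Lambda_D} H(y_{\Omega^c} \mid x_{\Omega^c})$.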

C. Proof of Theorems 4.1 and 4.2 for layered networks

As in the example illustrating the proof in Section C-B, the logic of the proof for general deterministic functions follows that of the linear model quite closely. For any such cut $\Omega$, define the following sets:

• $L_l(\Omega)$: the nodes that are in $\Omega$ and are at layer $l$ (for example $S \in L_1(\Omega)$),

• $R_l(\Omega)$: the nodes that are in $\Omega^c$ and are at layer $l$ (for example $D \in R_{l_D}(\Omega)$).

Footnote (to the Markovian expansion of the cut value in Section C-B): Though in the encoding scheme there is a dependence between $x_{A_1}, x_{A_2}, x_{B_1}, x_{B_2}, x_S$, in the single-letter form of the mutual information, under a product distribution, $x_{A_1}, x_{A_2}, x_{B_1}, x_{B_2}, x_S$ are independent of each other. Therefore, for example, $y_{B_2}$ is independent of $x_{B_2}$, leading to $H(y_{B_2} \mid x_{A_2}, x_{B_2}) = H(y_{B_2} \mid x_{A_2})$. Using this argument for the cut-set expression $I(y_{\Omega^c}; x_\Omega \mid x_{\Omega^c})$, we get the expansion.

As in Section IV-A, we can define the bi-partite network associated with a cut $\Omega$. Instead of a transfer matrix $G_{\Omega,\Omega^c}(\cdot)$ associated with the cut, we have a transfer function $\tilde{G}_\Omega$. Since we are still dealing with a layered network, as in the linear model case, this transfer function breaks up into components corresponding to each of the $l_D$ layers of the network. More precisely, we can create $d = l_D$ disjoint sub-networks of nodes corresponding to each layer of the network, with the set of nodes $L_{l-1}(\Omega)$, which are at distance $l - 1$ from $S$ and are in $\Omega$, on one side and the set of nodes $R_l(\Omega)$, which are at distance $l$ from $S$ and are in $\Omega^c$, on the other side, for $l = 2, \ldots, l_D$. Each of these clusters has a transfer function $G_l(\cdot)$, $l = 1, \ldots, l_D$, associated with it.

As in the linear model, each node $i$ sees a signal related to $w = w_1$ in block $l_i = l - 1$, and therefore waits to receive this block and then does a mapping using the general encoding function given in (170) as

$$\mathbf{x}_j^{(k)}(w) = f_j^{(k)}\left(\mathbf{y}_j^{(k-1)}(w)\right). \tag{179}$$

The received signals in the nodes $j \in R_l(\Omega)$ are deterministic transformations of the transmitted signals from nodes $T_l = \{u : (u, v) \in \mathcal{E}, v \in R_l(\Omega)\}$. As in the linear model analysis of Section IV-A, the dependence is on all the transmitting signals at distance $l - 1$ from the source, not just the ones in $L_l(\Omega)$. Since all the receivers in $R_l(\Omega)$ are at distance $l$ from $S$, they form the receivers of layer $l$.

We now define the following events:

• $\mathcal{L}_l$: Event that the nodes in $L_l$ can distinguish between $w$ and $w'$, i.e. $\mathbf{y}_{L_l}(w) \ne \mathbf{y}_{L_l}(w')$,

• $\mathcal{R}_l$: Event that the nodes in $R_l$ cannot distinguish between $w$ and $w'$, i.e. $\mathbf{y}_{R_l}(w) = \mathbf{y}_{R_l}(w')$.

Similar to Appendix C-B we can write

$$\mathcal{P} = \mathbb{P}\{\mathcal{R}_l, \mathcal{L}_{l-1},\ l = 2, \ldots, l_D\} \tag{180}$$
$$= \prod_{l=2}^{l_D} \mathbb{P}\{\mathcal{R}_l, \mathcal{L}_{l-1} \mid \mathcal{R}_j, \mathcal{L}_{j-1},\ j = 2, \ldots, l-1\} \tag{181}$$
$$\le \prod_{l=2}^{l_D} \mathbb{P}\{\mathcal{R}_l \mid \mathcal{R}_j, \mathcal{L}_j,\ j = 2, \ldots, l-1\} = \prod_{l=2}^{l_D} \mathbb{P}\{\mathcal{R}_l \mid \mathcal{R}_{l-1}, \mathcal{L}_{l-1}\}. \tag{182}$$

Note that for all the transmitting nodes in $R_{l-1}$, which cannot distinguish between $w, w'$, the transmitted signal would be the same under both $w$ and $w'$, i.e.

$$\mathbf{x}_j(w) = \mathbf{x}_j(w'), \quad j \in R_{l-1}.$$

Therefore, since $(\{\mathbf{x}_j(w)\}_{j \in R_{l-1}}, \mathbf{y}_{R_l}(w)) \in T_\delta$, we have that

$$(\{\mathbf{x}_j(w')\}_{j \in R_{l-1}}, \mathbf{y}_{R_l}(w)) \in T_\delta.$$

Therefore, just as in Appendix C-B, we see that

$$\mathbb{P}\{\mathcal{R}_l \mid \mathcal{R}_{l-1}, \mathcal{L}_{l-1}\} = \mathbb{P}\left\{(\mathbf{x}_{L_{l-1}}(w'), \mathbf{x}_{R_{l-1}}(w), \mathbf{y}_{R_l}(w)) \in T_\delta(x_{L_{l-1}}, x_{R_{l-1}}, y_{R_l})\right\} \tag{183}$$
$$\doteq 2^{-T I(x_{L_{l-1}}; y_{R_l} \mid x_{R_{l-1}})}. \tag{184}$$

Therefore

$$\mathcal{P} \le \prod_{l=2}^{l_D} 2^{-T I(x_{L_{l-1}}; y_{R_l} \mid x_{R_{l-1}})} = 2^{-T \sum_{l=2}^{l_D} H(y_{R_l} \mid x_{R_{l-1}})}. \tag{185}$$

Due to the Markovian nature of the layered network, $\sum_{l=2}^{l_D} H(y_{R_l} \mid x_{R_{l-1}}) = H(y_{\Omega^c} \mid x_{\Omega^c})$. From this point the proof closely follows the steps from (178) onwards. Similarly, in a multicast scenario we declare an error if any receiver $D \in \mathcal{D}$ makes an error. Since we have $2^{RT}$ messages, from the union bound we can drive the error probability to zero if we have

$$R < \max_{\prod_{i \in \mathcal{V}} p(x_i)}\ \min_{D \in \mathcal{D}}\ \min_{\Omega \in \Lambda_D} H(y_{\Omega^c} \mid x_{\Omega^c}). \tag{186}$$

Appendix D 
Proof of Lemma [63] 
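Consider the SVD decomposition of $H$: $H = U\Sigma V^\dagger$ with singular values $\sigma_1, \ldots, \sigma_{\min(m,n)}$. Let us define $K = \min\{m, n\}$ and $\mathbf{x}_j = [x_{1,j}, \cdots, x_{m,j}]$, which is i.i.d. (over $1 \le j \le T$) $\mathcal{CN}(0, I_m)$.

Therefore, if $\|H\mathbf{x}_j\|_\infty \le \sqrt{2}$, then $\|\Sigma V^\dagger \mathbf{x}_j\|_2 \le \sqrt{2K}$, which means

$$\mathbb{P}\left\{\|H\mathbf{x}_j\|_\infty \le \sqrt{2}\right\} \le \mathbb{P}\left\{\|\Sigma V^\dagger \mathbf{x}_j\|_2 \le \sqrt{2K}\right\} \tag{187}$$
$$= \mathbb{P}\left\{\|\Sigma \mathbf{x}_j\|_2 \le \sqrt{2K}\right\} \tag{188}$$

where the last step is true since the distributions of $\mathbf{x}$ and $V^\dagger\mathbf{x}$ are the same.

Now by using (188), we get

$$\mathbb{P}\left\{\forall\, 1 \le j \le T : \|H[x_{1,j}, \cdots, x_{m,j}]^t\|_\infty \le \sqrt{2}\right\} \le \mathbb{P}\left\{\forall\, 1 \le j \le T : \|\Sigma[x_{1,j}, \cdots, x_{m,j}]^t\|_2 \le \sqrt{2K}\right\}$$
$$= \mathbb{P}\left\{\forall\, 1 \le j \le T : \sum_{i=1}^{\min\{m,n\}} \sigma_i^2 |x_{i,j}|^2 \le 2K\right\} \tag{189}$$
$$= \prod_{j=1}^{T} \mathbb{P}\left\{\sum_{i=1}^{\min\{m,n\}} \sigma_i^2 |x_{i,j}|^2 \le 2K\right\} \tag{190}$$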

n 

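where (a) follows from the Chernoff bound (see the footnote below).

Since $\sum_{i=1}^{\min\{m,n\}} \log(1 + \sigma_i^2) = \log\det(I + HH^*) = I(\mathbf{x}; H\mathbf{x} + \mathbf{z})$, for $\mathbf{x} \sim \mathcal{CN}(0, I_m)$ and $\mathbf{z} \sim \mathcal{CN}(0, I_n)$, we get the desired result.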

Appendix E 
Proof of Lemma [631 

We first prove the following lemmas.

Lemma E.1: Consider integer-valued random variables $x$, $r$ and $s$ such that

$$x \perp r \tag{192}$$
$$s \in \{-L, \ldots, 0, \ldots, L\} \tag{193}$$
$$\mathbb{P}\{|r| \ge k\} \le e^{-f(k)}, \quad \text{for all } k \in \mathbb{Z}^+ \tag{194}$$

for some integer $L$ and a function $f(\cdot)$. Let

$$y = x + r + s. \tag{195}$$

Then

$$H(y|x) \le 2\log_2 e\left(\sum_{k=1}^{\infty} f(k)e^{-f(k)}\right) + \frac{2L+1}{2} + N_f \tag{196}$$

$$H(x|y) \le \log(2L + 1) + 2\log_2 e\left(\sum_{k=1}^{\infty} f(k)e^{-f(k)}\right) + \frac{2L+1}{2} + N_f \tag{197}$$

where

$$N_f = \left|\left\{n \in \mathbb{Z}^+ \,\middle|\, e^{-f(n)} > \tfrac{1}{2}\right\}\right|. \tag{198}$$

Footnote (to the Chernoff bound step in Appendix D): We would like to acknowledge useful discussions with A. Ozgur on sharpening the proof of this result. It is also related to the proof technique in [20].

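Proof: By definition we have

$$H(y|x) = H(x + r + s|x) = H(r + s|x) \tag{199}$$
$$\le H(r + s) = -\sum_{k} \mathbb{P}\{r + s = k\}\log\mathbb{P}\{r + s = k\}. \tag{200}$$

Now since $-p\log p \le \frac{1}{2}$ for $0 \le p \le 1$, we have

$$-\sum_{k=-L}^{L} \mathbb{P}\{r + s = k\}\log\mathbb{P}\{r + s = k\} \le \frac{2L+1}{2}. \tag{201}$$

For $|k| > L$ we have

$$\mathbb{P}\{r + s = k\} \le \mathbb{P}\{|r| \ge |k| - L\} \le e^{-f(|k| - L)}. \tag{202}$$

Since $p\log p$ is decreasing in $p$ for $p < \frac{1}{2}$, we have

$$-\sum_{k=L+1}^{\infty} \mathbb{P}\{r + s = k\}\log\mathbb{P}\{r + s = k\} = -\sum_{\substack{k > L \\ e^{-f(k-L)} > 1/2}} \mathbb{P}\{r + s = k\}\log\mathbb{P}\{r + s = k\} - \sum_{\substack{k > L \\ e^{-f(k-L)} \le 1/2}} \mathbb{P}\{r + s = k\}\log\mathbb{P}\{r + s = k\} \tag{203}$$
$$\le \frac{N_f}{2} + \sum_{k=L+1}^{\infty} e^{-f(k-L)} f(k - L)\log e \tag{204}$$

and similarly

$$-\sum_{k=-\infty}^{-L-1} \mathbb{P}\{r + s = k\}\log\mathbb{P}\{r + s = k\} = -\sum_{\substack{k < -L \\ e^{-f(|k|-L)} > 1/2}} \mathbb{P}\{r + s = k\}\log\mathbb{P}\{r + s = k\} - \sum_{\substack{k < -L \\ e^{-f(|k|-L)} \le 1/2}} \mathbb{P}\{r + s = k\}\log\mathbb{P}\{r + s = k\} \tag{205}$$
$$\le \frac{N_f}{2} + \sum_{k=L+1}^{\infty} e^{-f(k-L)} f(k - L)\log e. \tag{206}$$

By combining (201), (204) and (206) we get

$$H(y|x) \le 2\log_2 e\left(\sum_{k=1}^{\infty} f(k)e^{-f(k)}\right) + \frac{2L+1}{2} + N_f. \tag{207}$$

Now we prove the second inequality:

$$H(x|y) = H(x|x + r + s) = H(x) - I(x; x + r + s) \tag{208}$$
$$= H(x) - H(x + r + s) + H(x + r + s|x) \tag{209}$$
$$\le H(x) - H(x + r + s|s) + H(y|x) \tag{210}$$
$$= H(x) - H(x + r|s) + H(y|x) \tag{211}$$
$$= H(x) - H(x + r) + I(x + r; s) + H(y|x) \tag{212}$$
$$\le H(x) - H(x + r) + H(s) + H(y|x) \tag{213}$$
$$\le H(x) - H(x + r|r) + \log(2L + 1) + H(y|x) \tag{214}$$
$$= H(x) - H(x) + \log(2L + 1) + H(y|x) \tag{215}$$
$$= \log(2L + 1) + H(y|x). \tag{216}$$

Therefore

$$H(x|y) \le \log(2L + 1) + 2\log_2 e\left(\sum_{k=1}^{\infty} f(k)e^{-f(k)}\right) + \frac{2L+1}{2} + N_f. \tag{217}$$

Corollary E.2: Assume $v$ is a continuous complex random variable, then

$$H([v + z] \mid [v]) \le 12 \tag{218}$$
$$H([v] \mid [v + z]) \le 12 \tag{219}$$

where $z$ is a $\mathcal{CN}(0, 1)$ random variable independent of $v$ and $[\cdot]$ is defined in Definition 6.1.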
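Proof: We use Lemma E.1 with variables

$$x = [\mathrm{Re}(v)] \tag{220}$$
$$r = [\mathrm{Re}(z)] \tag{221}$$
$$s = [\{\mathrm{Re}(v)\} + \{\mathrm{Re}(z)\}]. \tag{222}$$

Then $L = 1$ and since

$$\mathbb{P}\{|[\mathrm{Re}(z)]| \ge k\} \le \mathbb{P}\left\{|\mathrm{Re}(z)| \ge k - \tfrac{1}{2}\right\} \tag{223}$$
$$\le 2Q\left(k - \tfrac{1}{2}\right) \le e^{-\frac{(k - \frac{1}{2})^2}{2}}, \tag{224}$$

we can use

$$f(k) = \frac{(k - \frac{1}{2})^2}{2}. \tag{225}$$

Also since

$$e^{-\frac{(k - \frac{1}{2})^2}{2}} < \frac{1}{2}, \quad \text{for } k \ge 2, \tag{226}$$

we have $N_f = 1$. Hence

$$\log(2L + 1) + 2\log_2 e\left(\sum_{k=1}^{\infty} f(k)e^{-f(k)}\right) + \frac{2L+1}{2} + N_f \tag{227}$$
$$= 2\log_2 e\left(\sum_{k=1}^{\infty} \frac{(k - \frac{1}{2})^2}{2}\, e^{-\frac{(k - \frac{1}{2})^2}{2}}\right) + 2.5 + \log_2 3 \tag{228}$$
$$\approx 5.89 < 6. \tag{229}$$

As a result

$$H([\mathrm{Re}(v + z)] \mid [\mathrm{Re}(v)]) \le 6 \tag{230}$$
$$H([\mathrm{Re}(v)] \mid [\mathrm{Re}(v + z)]) \le 6. \tag{231}$$

Similarly

$$H([\mathrm{Im}(v + z)] \mid [\mathrm{Im}(v)]) \le 6 \tag{232}$$
$$H([\mathrm{Im}(v)] \mid [\mathrm{Im}(v + z)]) \le 6. \tag{233}$$

Therefore

$$H([v + z] \mid [v]) \le H([\mathrm{Re}(v + z)] \mid [\mathrm{Re}(v)]) + H([\mathrm{Im}(v + z)] \mid [\mathrm{Im}(v)]) \le 12 \tag{234}$$
$$H([v] \mid [v + z]) \le H([\mathrm{Re}(v)] \mid [\mathrm{Re}(v + z)]) + H([\mathrm{Im}(v)] \mid [\mathrm{Im}(v + z)]) \le 12. \tag{235}$$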
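The constant 5.89 in the proof of Corollary E.2 can be reproduced numerically. The sketch below assumes the tail exponent $f(k) = (k - \frac{1}{2})^2/2$ and $L = 1$, and evaluates the right-hand side of (197) with the infinite sum truncated at 200 terms (the discarded terms are far below floating-point precision). The variable names are chosen here for illustration.

```python
import math


def f(k):
    """Assumed tail exponent: P{|[Re(z)]| >= k} <= exp(-f(k))."""
    return (k - 0.5) ** 2 / 2


L = 1
# N_f counts the positive integers n with exp(-f(n)) > 1/2.
N_f = sum(1 for n in range(1, 200) if math.exp(-f(n)) > 0.5)
series = sum(f(k) * math.exp(-f(k)) for k in range(1, 200))
bound = (math.log2(2 * L + 1)
         + 2 * math.log2(math.e) * series
         + (2 * L + 1) / 2
         + N_f)
print(N_f, round(bound, 2))
```

The series converges after a handful of terms because $e^{-f(k)}$ decays like a Gaussian tail, and the resulting value stays below 6, which is what gives 6 bits per real dimension and 12 bits per complex dimension.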
H{[y^]\u,Fv)] < H{[y^]\w',Fv) (236)

Id 

E^([yvJI[yv._.>^v]) (237) 



1=2 
Id 



^//([yvJ|xv,_,,Fv) (238) 



1=2 
Id 



5^//([Re(yvJ]|xv,_,,Fv) + i/([Im(yvJ]|xv,_,,Fv) (239) 

1=2 

Corollary El] J^. 

< ^12r|Vi| (240) 

1=2 

12T\V\. (241) 

Appendix F 
Proof of Lemma [6T6] 

First note that $C_\Omega$ is the capacity of the MIMO channel that the cut $\Omega$ creates. Therefore, intuitively, we want to prove that the gap between the capacity of a MIMO channel and its capacity when it is restricted to equal power allocation at the transmitting antennas is upper bounded by a constant. Therefore, without loss of generality, we just focus on an $n \times m$ MIMO channel, with $K = \min\{m, n\}$,

$$\mathbf{y}^n = G\mathbf{x}^m + \mathbf{z}^n \tag{242}$$

with average transmit power per antenna equal to $P$ and i.i.d. complex normal noise. We know that the capacity of this MIMO channel is achieved with water filling, and

$$C = C_{wf} = \sum_{i=1}^{K} \log(1 + Q_{ii}\lambda_i) \tag{243}$$

where the $\lambda_i$'s are the singular values of $G$ and $Q_{ii}$ is given by the water filling solution satisfying

$$\sum_{i=1}^{K} Q_{ii} = mP. \tag{244}$$

Now with equal power allocation we have

$$C_{ep} = \sum_{i=1}^{K} \log(1 + P\lambda_i). \tag{245}$$
Now note that



n£imax(l,PAj 



C.,-C,, = log ( nil<i±%^ 1 (246) 

K 



v«=l 



< log n i+fe =i°Hn i+T? p^o) 



vi=l \ / / \j=l 



Now note that 



K 



J](l + %) = K + m (251) 



P 



1=1 

and therefore by arithmetic mean-geometric mean inequality we have 

K 



and hence 



K 

C^f - Cep < Klog(l + —) <K = mm{m, n}, (253) 

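Therefore the loss from restricting ourselves to use equal transmit powers at each antenna of an $m \times n$ MIMO channel is at most $\min\{m, n\}$ bits.

Now, let us apply (253) to prove Lemma 6.6. Note that the cut-set upper bound of (27), when applied to the Gaussian network, yields

$$\overline{C} = \max_{Q}\ \min_{\Omega \in \Lambda_D} \log\left|I + G_\Omega Q G_\Omega^*\right|, \tag{254}$$

where $G_\Omega$ represents the network transfer matrix from the transmitting set $\Omega$ to the receiving set $\Omega^c$. The maximization in (27) can be restricted to jointly Gaussian inputs represented by a covariance matrix $Q$ with individual power constraints. Now, clearly these constraints can be relaxed to sum-power constraints, yielding

$$\overline{C} = \max_{Q :\, Q_{ii} \le P,\, \forall i}\ \min_{\Omega \in \Lambda_D} \log\left|I + G_\Omega Q G_\Omega^*\right| \le \min_{\Omega \in \Lambda_D}\ \max_{Q :\, \mathrm{tr}(Q) \le |\Omega| P} \log\left|I + G_\Omega Q G_\Omega^*\right| = \min_{\Omega \in \Lambda_D} C_\Omega^{wf}. \tag{255}$$

Now, let us define $\overline{C}_{i.i.d.}$ to be the cut value for i.i.d. Gaussian inputs, i.e., $Q = I$. More precisely, from Definition 6.2 we have

$$\overline{C}_{i.i.d.} = \min_{\Omega \in \Lambda_D} \log\left|I + P G_\Omega G_\Omega^*\right| = \min_{\Omega \in \Lambda_D} C_\Omega^{ep}. \tag{256}$$

By using (253), we get

$$C_\Omega^{wf} - C_\Omega^{ep} \le \min\{|\Omega|, |\Omega^c|\} \le |\mathcal{V}|, \quad \forall \Omega. \tag{257}$$

Since $\min_\Omega C_\Omega^{wf} \le \min_\Omega C_\Omega^{ep} + |\mathcal{V}|$, we get the claimed result in Lemma 6.6 for the scalar case.

For the case with multiple antennas, we see that for any cut $\Omega$, the number of degrees of freedom is $\min\{\sum_{i \in \Omega} M_i, \sum_{i \in \Omega^c} N_i\}$. Note that $\max_\Omega \min\{\sum_{i \in \Omega} M_i, \sum_{i \in \Omega^c} N_i\} \le \sum_{i=1}^{|\mathcal{V}|} M_i$ and $\max_\Omega \min\{\sum_{i \in \Omega} M_i, \sum_{i \in \Omega^c} N_i\} \le \sum_{i=1}^{|\mathcal{V}|} N_i$, and hence $\max_\Omega \min\{\sum_{i \in \Omega} M_i, \sum_{i \in \Omega^c} N_i\} \le \min\{\sum_{i=1}^{|\mathcal{V}|} M_i, \sum_{i=1}^{|\mathcal{V}|} N_i\}$, yielding

$$\min\left\{\sum_{i \in \Omega} M_i, \sum_{i \in \Omega^c} N_i\right\} \le \min\left\{\sum_{i=1}^{|\mathcal{V}|} M_i, \sum_{i=1}^{|\mathcal{V}|} N_i\right\}, \quad \forall \Omega. \tag{258}$$

For a trivial upper bound to use in an argument analogous to (257), we can use (258) to see that

$$C_\Omega^{wf} \le C_\Omega^{ep} + \min\left\{\sum_{i=1}^{|\mathcal{V}|} M_i, \sum_{i=1}^{|\mathcal{V}|} N_i\right\}. \tag{259}$$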
1=1 1=1 

Appendix G 
Proof of Lemma [772] 

We first prove the following two lemmas:

Lemma G.1: Let $G$ be the channel gain matrix of an $m \times n$ MIMO system. Assume that there is an average power constraint equal to one at each node. Then for any input distribution $P_{\mathbf{x}}$,

$$\left|I(\mathbf{x}; [G\mathbf{x} + \mathbf{z}]) - I(\mathbf{x}; [G\mathbf{x}])\right| \le 12n \tag{260}$$

where $\mathbf{z} = [z_1, \ldots, z_n]$ is a vector of $n$ i.i.d. $\mathcal{CN}(0, 1)$ random variables.

Lemma G.2: Let $G$ be the channel gain matrix of an $m \times n$ MIMO system. Assume that there is an average power constraint equal to one at each node. Then for any input distribution $P_{\mathbf{x}}$,

$$\left|I(\mathbf{x}; G\mathbf{x} + \mathbf{z}) - I(\mathbf{x}; [G\mathbf{x} + \mathbf{z}])\right| \le 7n \tag{261}$$

where $\mathbf{z} = [z_1, \ldots, z_n]$ is a vector of $n$ i.i.d. $\mathcal{CN}(0, 1)$ random variables.

Note that Lemma 7.2 is just a corollary of these two lemmas, which are proved next.
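Proof: (proof of Lemma G.1)
First note that

$$I(\mathbf{x}; [G\mathbf{x}]) \le I(\mathbf{x}; [G\mathbf{x} + \mathbf{z}]) + I(\mathbf{x}; [G\mathbf{x}] \mid [G\mathbf{x} + \mathbf{z}])$$
$$\le I(\mathbf{x}; [G\mathbf{x} + \mathbf{z}]) + H([G\mathbf{x}] \mid [G\mathbf{x} + \mathbf{z}])$$
$$\overset{\text{(Corollary E.2)}}{\le} I(\mathbf{x}; [G\mathbf{x} + \mathbf{z}]) + 12n. \tag{262}$$

Similarly,

$$I(\mathbf{x}; [G\mathbf{x} + \mathbf{z}]) \le I(\mathbf{x}; [G\mathbf{x}]) + I(\mathbf{x}; [G\mathbf{x} + \mathbf{z}] \mid [G\mathbf{x}])$$
$$\le I(\mathbf{x}; [G\mathbf{x}]) + H([G\mathbf{x} + \mathbf{z}] \mid [G\mathbf{x}])$$
$$\overset{\text{(Corollary E.2)}}{\le} I(\mathbf{x}; [G\mathbf{x}]) + 12n. \tag{263}$$

Now from equations (262) and (263) we have

$$\left|I(\mathbf{x}; [G\mathbf{x} + \mathbf{z}]) - I(\mathbf{x}; [G\mathbf{x}])\right| \le 12n. \tag{264}$$

■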

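Proof: (proof of Lemma G.2)

Define the following random variables:

$$\mathbf{y} = G\mathbf{x} + \mathbf{z} \tag{265}$$
$$\hat{\mathbf{y}} = [G\mathbf{x} + \mathbf{z}] \tag{266}$$
$$\tilde{\mathbf{y}} = \hat{\mathbf{y}} + \mathbf{u} \tag{267}$$

where $\mathbf{u} = [u_1, \ldots, u_n]$ is a vector of $n$ i.i.d. complex variables with distribution uniform[0, 1] on both real and imaginary components, independent of $\mathbf{x}$ and $\mathbf{z}$.

By the data processing inequality we have $I(\mathbf{x}; \mathbf{y}) \ge I(\mathbf{x}; \hat{\mathbf{y}}) \ge I(\mathbf{x}; \tilde{\mathbf{y}})$. Now, note that

$$I(\mathbf{x}; \mathbf{y}) - I(\mathbf{x}; \tilde{\mathbf{y}}) = h(\mathbf{y}) - h(\tilde{\mathbf{y}}) + h(\tilde{\mathbf{y}}|\mathbf{x}) - h(\mathbf{y}|\mathbf{x}) \tag{268}$$
$$= h(\mathbf{y}) - h(\tilde{\mathbf{y}}) + h(\tilde{\mathbf{y}}|\mathbf{x}) - n\log(\pi e) \tag{269}$$
$$= h(\mathbf{y}|\tilde{\mathbf{y}}) - h(\tilde{\mathbf{y}}|\mathbf{y}) + h(\tilde{\mathbf{y}}|\mathbf{x}) - n\log(\pi e) \tag{270}$$
$$= h(\mathbf{y}|\tilde{\mathbf{y}}) - h(\mathbf{u}) + h(\tilde{\mathbf{y}}|\mathbf{x}) - n\log(\pi e) \tag{271}$$
$$= h(\mathbf{y}|\tilde{\mathbf{y}}) + h(\tilde{\mathbf{y}}|\mathbf{x}) - n\log(\pi e) \tag{272}$$

where the last step is true since $h(\mathbf{u}) = nh(u_i) = 2n\log 1 = 0$. Now

$$|\mathrm{Re}(y) - \mathrm{Re}(\tilde{y})| \le \max_{x \in \mathbb{C}}\left(|[\mathrm{Re}(x)] - \mathrm{Re}(x)|\right) + \max|\mathrm{Re}(u)| \le \frac{3}{2} \tag{273}$$

and similarly

$$|\mathrm{Im}(y) - \mathrm{Im}(\tilde{y})| \le \max_{x \in \mathbb{C}}\left(|[\mathrm{Im}(x)] - \mathrm{Im}(x)|\right) + \max|\mathrm{Im}(u)| \le \frac{3}{2}. \tag{274}$$

Therefore

$$h(\mathbf{y}|\tilde{\mathbf{y}}) = h(\mathbf{y} - \tilde{\mathbf{y}}|\tilde{\mathbf{y}}) \tag{275}$$
$$\le n\log\left(2\pi e\sqrt{\max\left(|\mathrm{Re}(y) - \mathrm{Re}(\tilde{y})|\right)\max\left(|\mathrm{Im}(y) - \mathrm{Im}(\tilde{y})|\right)}\right) \tag{276}$$
$$= n\log 3\pi e. \tag{277}$$

For the second term, let us look at the $i$-th element of $\tilde{\mathbf{y}}$,

$$\tilde{y}_i = [\mathbf{g}_i\mathbf{x} + z_i] + u_i = \mathbf{g}_i\mathbf{x} + z_i + \delta(\mathbf{g}_i\mathbf{x} + z_i) + u_i \tag{278}$$

where $\tilde{y}_i$ is the $i$-th component of $\tilde{\mathbf{y}}$, $\mathbf{g}_i$ is the $i$-th row of $G$, and $\delta(x) = [x] - x$. Clearly $|\mathrm{Re}(\delta(x))|, |\mathrm{Im}(\delta(x))| \le \frac{1}{2}$ for all $x \in \mathbb{C}$. Therefore, given $\mathbf{x}$, the variance of $\tilde{y}_i$ is bounded by

$$\mathrm{Var}[\mathrm{Re}(\tilde{y}_i)|\mathbf{x}] = \mathrm{Var}\left[\mathrm{Re}(z_i) + \mathrm{Re}(\delta(\mathbf{g}_i\mathbf{x} + z_i)) + \mathrm{Re}(u_i) \mid \mathbf{x}\right]$$
$$\le \mathrm{Var}[\mathrm{Re}(z_i)] + \mathrm{Var}[\mathrm{Re}(\delta(\mathbf{g}_i\mathbf{x} + z_i))|\mathbf{x}] + 2\,\mathrm{Cov}[\mathrm{Re}(z_i), \mathrm{Re}(\delta(\mathbf{g}_i\mathbf{x} + z_i))|\mathbf{x}] + \mathrm{Var}[\mathrm{Re}(u_i)]$$
$$\le \mathrm{Var}[\mathrm{Re}(z_i)] + |\max\mathrm{Re}(\delta(\cdot))|^2 + 2\sqrt{\mathrm{Var}[\mathrm{Re}(z_i)]} \times |\max\mathrm{Re}(\delta(\cdot))| + \mathrm{Var}[\mathrm{Re}(u_i)]$$
$$\le \frac{1}{2} + \frac{1}{4} + 1 + \frac{1}{12} = \frac{11}{6}. \tag{279}$$

Similarly

$$\mathrm{Var}[\mathrm{Im}(\tilde{y}_i)|\mathbf{x}] \le \frac{11}{6}. \tag{280}$$

Therefore

$$h(\tilde{\mathbf{y}}|\mathbf{x}) \le \sum_{i=1}^{n} h(\tilde{y}_i|\mathbf{x}) \le \sum_{i=1}^{n} \log\left(2\pi e\sqrt{\mathrm{Var}[\mathrm{Re}(\tilde{y}_i)|\mathbf{x}]\,\mathrm{Var}[\mathrm{Im}(\tilde{y}_i)|\mathbf{x}]}\right) \le n\log\frac{11}{3}\pi e. \tag{281}$$

Now from Equations (272), (277) and (281) we have

$$I(\mathbf{x}; \mathbf{y}) - I(\mathbf{x}; \tilde{\mathbf{y}}) \le h(\mathbf{y}|\tilde{\mathbf{y}}) + h(\tilde{\mathbf{y}}|\mathbf{x}) - n\log(\pi e) \tag{282}$$
$$\le n\log 11\pi e \approx 6.55n < 7n. \tag{283}$$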
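The per-dimension constant in the last step can be reproduced with a few lines of arithmetic. The sketch below takes the three terms $\log 3\pi e$, $\log\frac{11}{3}\pi e$ and $\log \pi e$ from the proof (logarithms base 2) and also re-derives the variance bound $\frac{11}{6}$ with exact fractions; the variable names are introduced here for illustration.

```python
import math
from fractions import Fraction

# Variance bound per real dimension: 1/2 + 1/4 + 1 + 1/12.
variance_bound = Fraction(1, 2) + Fraction(1, 4) + 1 + Fraction(1, 12)

pi_e = math.pi * math.e
h_y_given_ytilde = math.log2(3 * pi_e)                        # per-symbol bound (277)
h_ytilde_given_x = math.log2(2 * pi_e * float(variance_bound))  # per-symbol bound (281)
h_noise = math.log2(pi_e)                                      # h(y|x) per symbol

gap_per_dimension = h_y_given_ytilde + h_ytilde_given_x - h_noise
print(variance_bound, round(gap_per_dimension, 2))
```

The three logarithms collapse to $\log_2(11\pi e)$, which is about 6.55 bits per receive dimension and therefore below the 7 bits claimed in Lemma G.2.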